Analysis of Gene Expression Data

The University of the Philippines Manila (UPM) invited us to talk about our research in bioinformatics. I will discuss our work on visualizing yeast gene expression data.


Review of Linear Algebra for Machine Learning

Data is often represented by vectors with $d$ scalar features, $x = (x_1, x_2, \ldots, x_d)^T$.

Definition 1. The inner product of two vectors $x$ and $y$ is

$x^T y = \sum_{i=1}^{d} x_i y_i$

Note that $x^T y = y^T x$, that is, the inner product is symmetric.
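As a small illustration of Definition 1, the inner product can be computed as a sum of elementwise products. This is a minimal sketch in plain Python; the function name `inner_product` and the sample vectors are assumptions made for the example, not part of the original post.

```python
def inner_product(x, y):
    """Return sum_i x_i * y_i for two vectors with the same number of features."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of features")
    return sum(xi * yi for xi, yi in zip(x, y))


# Hypothetical sample vectors with d = 3 features.
x = [1.0, 2.0, 3.0]
y = [4.0, -5.0, 6.0]

print(inner_product(x, y))                         # 1*4 + 2*(-5) + 3*6 = 12.0
print(inner_product(x, y) == inner_product(y, x))  # symmetry: True
```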

Definition 2. The Euclidean norm of a vector $x$ is

$\|x\| = \sqrt{x^T x} = \sqrt{\sum_{i=1}^{d} x_i^2}$

Note that $\|x\|^2 = x^T x$.

If $\|x\| = 1$, then $x$ is normalized.

Otherwise, provided $x$ is not the zero vector, we can normalize $x$ to $x'$ as follows:

$x' = \frac{x}{\|x\|}$
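The norm and the normalization step can be sketched in a few lines of Python. The helper names `norm` and `normalize` are assumptions chosen for this example; the zero vector is rejected because it has no direction to preserve.

```python
import math


def norm(x):
    """Euclidean norm: the square root of the sum of squared features."""
    return math.sqrt(sum(xi * xi for xi in x))


def normalize(x):
    """Scale x to unit length; the zero vector cannot be normalized."""
    n = norm(x)
    if n == 0.0:
        raise ValueError("the zero vector cannot be normalized")
    return [xi / n for xi in x]


x = [3.0, 4.0]          # hypothetical example vector
print(norm(x))          # 5.0
x_unit = normalize(x)
print(x_unit)           # [0.6, 0.8]
print(math.isclose(norm(x_unit), 1.0))  # True, up to rounding
```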

Definition 3. Vectors $x$ and $y$ are orthogonal if their scalar product is zero, that is, $x^T y = 0$.

The angle between two nonzero vectors $x$ and $y$ is $\theta$ if

$\cos\theta = \frac{x^T y}{\|x\| \, \|y\|}$

Geometric interpretation is shown below

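Note that the orthogonal projection of $y$ onto $x$ is

$\frac{x^T y}{\|x\|^2} \, x$

whose length is $\|y\| \, |\cos\theta|$, where $\theta$ is the angle between $x$ and $y$.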
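The angle and the orthogonal projection can be checked numerically. The sketch below assumes nonzero vectors; the names `dot`, `angle`, and `project` are hypothetical helpers for this example. The cosine is clamped to [-1, 1] so that rounding error cannot push it outside the domain of `acos`.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


def angle(x, y):
    """Angle in radians between nonzero vectors x and y."""
    cos_theta = dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.acos(cos_theta)


def project(y, x):
    """Orthogonal projection of y onto the direction of nonzero x."""
    scale = dot(x, y) / dot(x, x)
    return [scale * xi for xi in x]


x = [2.0, 0.0]   # hypothetical example vectors
y = [1.0, 1.0]

theta = angle(x, y)
print(math.isclose(theta, math.pi / 4))  # True: the vectors are 45 degrees apart

p = project(y, x)
print(p)                                 # [1.0, 0.0]

# The residual y - p is orthogonal to x, as the definition requires.
residual = [yi - p_i for yi, p_i in zip(y, p)]
print(dot(residual, x))                  # 0.0
```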

Introduction to Machine Learning

Machine learning is useful when

  1. a pattern exists,
  2. the problem is difficult to pin down mathematically, and
  3. you have data.

A pattern is "an entity, vaguely defined, that could be given a name" (Watanabe).

Examples of patterns are palindromes in a sequence, the spatial configuration of pixels in character recognition, speech signals in a spectrogram, and the salary, age, and debt records in credit card applications.

Learning is a process by which parameters of a learning machine are modified through a continuous process of stimulation by the environment in which it is embedded.

 

 

In Figure 1, the parameters of the learning machine are adjusted based on the error signal.
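To make error-driven parameter adjustment concrete, here is a minimal sketch that assumes a one-input linear machine trained with the least-mean-squares (delta) rule. The post does not say which learning rule Figure 1 depicts, so the rule, the function name `train_lms`, the learning rate, and the sample data are all assumptions for illustration.

```python
def train_lms(samples, learning_rate=0.1, epochs=200):
    """Fit output = w * x + b by nudging w and b along the error signal."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = w * x + b
            error = target - output            # error signal from the teacher
            w += learning_rate * error * x     # adjust parameters to shrink the error
            b += learning_rate * error
    return w, b


# Hypothetical environment: targets generated by t = 2x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_lms(samples)
print(w, b)  # approaches 2 and 1 as training proceeds
```

Each update moves the parameters in the direction that reduces the current error, so the size of the correction shrinks as the machine's output approaches the target.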

Learning Paradigm

  1. Supervised Learning – with the help of a teacher
  2. Unsupervised Learning – without a teacher, using only the structure of the data

Why is unsupervised learning important?

It is important because it may reveal new patterns in the data, leading to knowledge discovery.