| ||||||||||||||||||
| ||||||||||||||||||
|
||||||||||||||||||
Using Genetic Algorithm for Parameter EstimationBy Yi Wang
Computer Science Department,Tsinghua University,100084, Beijing, China 1 Introduction1 Genetic algorithm is a method of searching, search for a result equal to or close to the answer of a given problem. In the principle of artificial intelligence, the problem of training a generative mode given a set of data (sometimes called parameter estimation) is a typical searching problem -- search for parameters to adjust the model that generates data the most likely to the given training data. Many solutions have been given for different models. In this note, we show how to estimate parameters of a mixture distribution with example and experiments. 1.1 The Problem and ConceptsIt is called optimization that search for answers that best fit some requirements. There are many solutions to different types of optimization problems, such as conjugate gradient method, primal-dual method, simplex method and many many others. A typical subset of the optimization problems is estimating parameters of a
generative model given a set of data generated form the model. A Generative
model is usually some statistical model that approximate certain data generation
procedure (that is why it is called generative). Some parameters adjusts
gritty details of the data generation process. The parameter estimation (or
model training in AI term) is (1) given a set of data
A famous method for estimate model parameters given a set of partial data or noisy data is maximization expectation algorithm(EM). A fact worthy noticing is: in many problems, why the data is partial is not because it is really partial, but because we try to model it as partial, so EM can be applied. From another aspect, if we model the finite set of parameters as a chromosome string, i.e. each parameter corresponding to a gene in the chromosome. Usually, when the genes are discrete values, like integers, we can find the optimize result with genetic algorithm(GA); when the genes are continuous values, we can find result close to the real value. 2 Estimate Means of Mixture DistributionIn this section, we demonstrate using genetic algorithm for estimating parameters
of a 2-component mixture Gaussian distribution. A mixture distribution with
Which means, for a given observed sample The steps of our experiment contains:
2.1 The ModelIn our experiment, the each component is 1D Gaussian, i.e.
In order to give an intuitive visualization of the problem domain in 3D space, we
reduce the number of parameter to
Thus, our mixture distribution can be written as
As you see, there are only 2.2 Drawing SamplesNow we fix
2.3 Visualize the Problem DomainWith the model type (not the parameters) known, we can compute the likelihood
between the samples
A brute force solution to our problem is to enumerate all possible parameter-pairs
There are two points in this figure need to be noticed,
2.4 Search for Peak with GAFrom figure 2, it is clear that want we want
is the position of one of the two peaks. If we solve this problem with other methods,
such like EM algorithm, we will start from one position, and close to the
peak with each iteration. With GA, we start from a population of starting
positions -- each individual in the population is represented with its chromosome --
a string of genes. In our example, a chromosome is consist of two genes, one for
Figure 3 shows the likelihood figure together with result of each iteration of our GA algorithm. The red line segments shows the process of close to one of the peaks.
A APPENDIXA.1 MATLAB Programs in This ExperimentA.1.1 The 1D Gaussian function
function y = gauss1d(x, mu, sigma2)
% Vector version of 1D Gaussian distribution (normal distribution)
for i=1 : length(x)
y(i) = gauss1d_s(x(i), mu, sigma2);
end
function y = gauss1d_s(x, mu, sigma2)
% Scalar version of 1D Gaussian distribution
k = 1/sqrt(2 * pi * sigma2);
e = - (x-mu)*(x-mu) / (2* sigma2);
y = k * exp(e);
A.1.2 Draw Samples from Gaussian distributionfunction OSS = draw_gauss(N, mu, Sigma) OSS = randn(1,N) * sqrtm(Sigma) + repmat(mu,1,N); A.1.3 Compute the Likelihood for Any Given Parameter
function L = true_likelihood(O, mu1, mu2, sigma)
% Matrix version of the true_likelihood between
% a set of points and a given parameter-pair
L = ones(length(mu1), length(mu2));
for i=1:length(mu1)
for j=1:length(mu2)
Mu = [ mu1(i), mu2(j) ];
L(i,j) = true_likelihood_s(O, Mu, sigma);
end;
end;
function L = true_likelihood_s(O, Mu, sigma)
% Scalar version of the true_likelihood between
% a set of points and a given parameter-pair
T = ones(size(O,1),size(O,2));
L = 0;
for i1=1:2
for i2=1:2
for i3=1:2
for i4=1:2
for i5=1:2
for i6=1:2
T(1) = Mu(i1);
T(2) = Mu(i2);
T(3) = Mu(i3);
T(4) = Mu(i4);
T(5) = Mu(i5);
T(6) = Mu(i6);
pO_T = 1;
for j=1:6
pO_T = pO_T * gauss1d(O(j), T(j), sigma);
end;
pT = 1;
for j=1:6
pT = pT * 0.5;
end;
L = L + pO_T * pT;
end;
end;
end;
end;
end;
end;
A.2 The GA AlgorithmThis program output the best individual of each iteration to MATLAB format, which is used for visualize the GA process in figure 3. A.2.1 The Source Code in C++
//
// This program is a demo for estimate means of a
// mixture distribution containing 2 Gaussians
//
#include <iostream>
#include <vector>
#include <ctime>
#include <cmath>
const double pi = 3.14159265357989;
using namespace std;
// Return float random number in [0,1)
inline double frand()
{
return (float)rand()/(float)RAND_MAX;
}
// The samples drawn from the mixture distribution
// with known parameters
double O[] = {
999999, // no use
1.2083, 0.4473, 0.9893,
-1.0264, -1.0000, -1.1741
};
const size_t _PopSize = 20;
const size_t _ChromLen = 2;
const float _GeneRangeMin = -2.5f;
const float _GeneRange = 5.0f;
const float _SurviveRatio = 0.1f;
const float _ReproduceRatio = 0.5f;
const float _MutateRatio = 0.5f;
const float _MutateDist = 4.0f;
const int _MaxIterNum = 40;
struct t_Individule
{
float chromosome[_ChromLen]; // the parameters to be estimated
float fitness; // the true likehood to given samples
t_Individule()
{
fitness = 0;
for ( size_t i=0; i<_ChromLen; i++ )
chromosome[i] = 0;
}
};
bool better(t_Individule i1, t_Individule i2)
{
return i1.fitness > i2.fitness;
}
ostream & operator << (ostream & ostr, const t_Individule & i)
{
ostr << "fitness: " << i.fitness << '\t'
<< "chromosome: ";
for ( size_t j=0; j<_ChromLen; j++ )
ostr << i.chromosome[j] <<'\t';
return ostr;
}
typedef vector<t_Individule> t_Population;
void InitPopulation(t_Population & pop, t_Population & buf)
{
pop.resize(_PopSize);
buf.resize(_PopSize);
for ( size_t i=0; i<_PopSize; i++ )
{
for ( size_t j=0; j<_ChromLen; j++ )
pop[i].chromosome[j] =
(_GeneRangeMin + _GeneRange) * 0.5 +
(frand()-0.5) * _GeneRange * 0.1;
}
}
double gauss(double x, double mu, double sigma2)
{
double k = 1/sqrt(2 * pi * sigma2);
double e = - (x-mu)*(x-mu) / (2* sigma2);
double y = k * exp(e);
return y;
}
double true_likelihood_s(double * O, double * Mu, double sigma)
{
double * T = new double[7];
double L = 0;
for (int i1=0; i1<2; i1++) {
for (int i2=0; i2<2; i2++) {
for (int i3=0; i3<2; i3++) {
for (int i4=0; i4<2; i4++) {
for (int i5=0; i5<2; i5++) {
for (int i6=0; i6<2; i6++) {
T[1] = Mu[i1];
T[2] = Mu[i2];
T[3] = Mu[i3];
T[4] = Mu[i4];
T[5] = Mu[i5];
T[6] = Mu[i6];
double pO_T = 1;
for (int j=1; j<=6; j++ )
pO_T *= gauss(O[j], T[j], sigma);
double pT = 1;
for (int j=1; j<=6; j++ )
pT *= 0.5;
L += pO_T * pT;
}
}
}
}
}
}
delete [] T;
return L;
}
void CalcFitness(t_Population & pop)
{
for (size_t i=0; i<_PopSize; i++)
{
double t[2] = {
pop[i].chromosome[0],
pop[i].chromosome[1] };
pop[i].fitness = true_likelihood_s(O, t, 0.3);
}
}
void SortOnFitness(t_Population & pop)
{
sort(pop.begin(), pop.end(), better);
}
void Mate(t_Individule & kid,
t_Individule & p1, t_Individule & p2)
{
for ( size_t i=0; i<_ChromLen; i++ )
{
kid.chromosome[i] =
(p1.chromosome[i] + p2.chromosome[i]) /2.0f;
}
}
void Mutate(t_Individule & in)
{
size_t iPos = rand()%_ChromLen;
in.chromosome[iPos] += (frand()-0.5) * _MutateDist;
if ( in.chromosome[iPos] < _GeneRangeMin )
in.chromosome[iPos] = _GeneRangeMin;
else if ( in.chromosome[iPos] > _GeneRangeMin + _GeneRange )
in.chromosome[iPos] = _GeneRangeMin + _GeneRange;
}
int main(int argc, char ** argv)
{
t_Population a, b;
t_Population * pop =&a , * buf =&b, * temp =NULL;
InitPopulation(*pop, *buf);
cout <<"steps = [ \n";
for (int iter=0; iter<_MaxIterNum; iter++ )
{
CalcFitness(*pop);
SortOnFitness(*pop);
cout <<(*pop)[0].chromosome[0] << " "
<<(*pop)[0].chromosome[1] << " "
<<(*pop)[0].fitness << endl;
if ( (*pop)[0].fitness == 0 )
break;
size_t survived = _PopSize*_SurviveRatio;
// Survive
copy(pop->begin(), pop->begin() + survived,
buf->begin());
// Reproduce
for ( size_t i=survived; i<_PopSize; i++ )
{
size_t iP1 = rand() % _PopSize*_ReproduceRatio;
size_t iP2 = rand() % _PopSize*_ReproduceRatio;
Mate((*buf)[i], (*pop)[iP1], (*pop)[iP2]);
if ( rand() < RAND_MAX * _MutateRatio )
Mutate((*buf)[i]);
}
// Swap
temp = pop;
pop = buf;
buf = temp;
}
cout << "];\n";
}
1 You can find in the next section for descriptions of most concepts mentioned in this section. 2
The problem can be solved with EM by considering the
3
How many peaks will have for a
Submitted: 03/10/2004 Article content copyright © Yi Wang, 2004.
|
|
|||||||||||||||||
All content copyright © 1998-2007, Generation5 unless otherwise noted.
- Privacy Policy - Legal - Terms of Use -