Tuesday, November 24, 2009

Genetic Programming, Self Reproduction



[日本語] [to English]

遺伝的プログラミングと自己複製

遺伝的プログラミング:

遺伝的プログラミング (GP: Genetic Programming) は、遺伝的アルゴリズム (GA: Genetic Algorithms) に基づいています。GA は進化論 (ダーウィンなど) を使い、データ操作によって最適解探索、学習、推論などの機能を実行します。

GA は、まず個体 (あるいは、それを表現するに充分なデータ構造) のグループからスタートします。それぞれの個体は、それぞれ環境適応度を持っています。この適応度は個体によって違います。これらの個体は生殖 (あるいは、再結合) をおこない、次世代を作ります。

生殖の過程では、「適者生存」の原理のように、適応度の高いものが低いものよりもより高い確率で生存します。

GA モデルは、GTYPE (遺伝タイプ) と PTYPE (発現タイプ) の両方を持つことが必要です。GTYPE は、簡単に言えば遺伝子、PTYPE はそれぞれの個体が GTYPE をもとに発現した形質と行動特性 (行動パターン) と言えます。

世代が、(t, t+1, t+2...) というように進むにつれ、グループ全体の環境適応度は高まってゆきます。
生殖をコンピュータでの計算操作でおこなう際には、部分的な遺伝子交換 (crossover: 交差) や突然変異 (ごくまれに、遺伝子の一部をランダムに変更する) が使われています。

遺伝的プログラミングや遺伝的アルゴリズムをうまくモデル化すると、シミュレーション、あるいは進化エンジンなどのさまざまな分野に応用できる可能性があります。



自己複製:

最も簡単な自己複製プログラム (C言語) の例をあげます。これは、自分自身のソースコードを print する(書き出す)だけのものです。

#include
char *s="#include %cchar *s=%c%s%c;main(){printf(s,10,34,s,34,10);return 0;}%c";main(){printf(s,10,34,s,34,10);return 0;}

コンパイルして実行すると、ソースコードを全くそのまま書き出します。

自己複製は、多くのコンピュータサイエンスやプログラム開発者にとって、魅力的な課題の一つです。

自己複製プログラムを作る実用的方法の一つは:

  • まず最初の卵 (egg) になるプログラムを作り、それが他のプログラムを作って実行できるようにする。

  • そのオブジェクトコードを暗号化した文字列にして、親 (mother: 鶏) プログラムに入れる。

  • 親プログラムは、この文字列を解読(暗号化の逆を)して、また新しいプログラムを作る。

  • ....「鶏->卵」のプロセスを繰り返す。

それから、乱数を使って突然変異を生じさせてもいいでしょう。



[English] [to 日本語]

Genetic Programming, Self Reproduction

Genetic Programming:

Genetic programming (GP) is based on genetic algorithms (GA). GA uses the theory of evolution (Darwin, et al) and manipulates data to do best-solution search, learning, and inference.

GA starts from a group of individual creatures (or substantial sets of data). Each individual has its own adaptation level. The level differs. Those individuals do reproduction (or recombination) and create next generation descendants.

During reproduction, those with higher adaptation level have higher probability of survival than those with lower adaptation level - like "survival of the fittest".

GA model should have both GTYPE (gene-type) and PTYPE (pheno-type). GTYPE is basically gene and PTYPE is each creature's resulting characteristics and behavioral patterns based on GTYPE.

As the generation (t, t+1, t+2...) goes forward, the overall adaptation level of the group will improve (become well fitter to the environment).

For reproduction, partial gene-exchange (crossover) and mutation (random change of some part of gene, in some rare cases) are used for computational operation.

If GA/GP is properly modeled, this could be used in various areas - for simulations, and even evolution engines.


Self-Reproduction:

One of the simplest C programs which can reproduce (in this case, print its source code) is shown below:

#include
char *s="#include %cchar *s=%c%s%c;main(){printf(s,10,34,s,34,10);return 0;}%c";main(){printf(s,10,34,s,34,10);return 0;}

When compiled & executed, it prints its source code exactly.

Self-reproduction is an attractive theme for many computer scientists and rogrammers.

One practical way of creating self-reproductive programs would be:

  • Create the first egg program which can create another program and execute it.

  • Then encrypt the object code and put it as a character string into a mother program.

  • A mother program creates a new program by decrypting the character string.

  • .... "Chicken & egg" process continues….

Then we may use random numbers for mutation.

11 comments:

  1. That s an interesting first post, Takushi.
    Although it must be complicated for mere mortals.
    But I do think that genetic algorithms are interesting.

    Do you think that we will be one day, able to create a program that begins with , say a black box like in cell automata and produce human beings ultimately?

    ReplyDelete
  2. oh by the way, i ve made a mistake.
    It s not genetic algorithm, it s generic algorithm.
    What s the difference?

    ReplyDelete
  3. I meant "genetic" - not "generic". Sorry. So I corrected typo. I also modified the C program example part in HTML (in Edit mode) so that the long line can properly fit in the screen.

    ReplyDelete
  4. This comment has been removed by the author.

    ReplyDelete
  5. I added a Japanese translation upfront. I also added intra-post references (jumps) by directly editing HTML so that readers can jump back and forth between Japanese and English.

    英語だけだとわかりにくいかもしれないため、日本語にしたものを最初に入れました。また、投稿をHTMLの状態で編集し、投稿内で自由に[日本語]<->[English]の間でのジャンプができるようにしました。

    ReplyDelete
  6. that s very cool.

    so do you know a practical application for genetic algorithms?

    And also, the question is: if you rely on self-reproduction, can you reach highly complex AI?

    ReplyDelete
  7. Not sure.....
    More than 20 years ago, I wrote a LISP program which modified itself during execution because LISP treated code and data in the same way.
    In the OOP (Object Oriented Programming) languages, this is practically very difficult.
    It is easy to create a Java program using an encrypted string of other Java program (.class file after compilation) and reproduce that program and execute it. But this is far from the goal.
    We may need a new programming language - to support genetic programming & self reproduction.

    ReplyDelete
  8. good idea!

    what about starting creating that new programming language right now?

    ReplyDelete
  9. For the new language, I would like to take good features from LISP, Java, Pascal, Perl, and maybe C++. Good things of LISP is that it treats code & data in the same way so that the program can mutate during execution. Good things about Pascal is that it distinguishes functions & procedures and allows the user to organize those in hierarchical scopes. Perl has built-in hash, sort, pattern atching/replacement/value-extraction for text strings, etc. I still heavily use Perl to quickly write programs with substantial complexity.

    ReplyDelete
  10. I see.

    We must blend all of those languages' good points to erase the bad points...like a genetic blend between one black guy and one white woman...

    The only language i enjoyed a little bit is perl by the way.
    Why? Because of its flexibility and its nonchalance.

    ReplyDelete