Evolutionary Computation - Artificial Intelligence - Lecture Notes | CPSC 5185U, Study notes of Computer Science

Material Type: Notes; Class: Artificial Intelligence; Subject: Computer Science; University: Columbus State University; Term: Fall 2009;

Uploaded on 08/04/2009

Chapter 7 – Evolutionary Computation
In this chapter, we discuss another method of mimicking natural processes in our design of
computer software. This time, we discuss evolutionary computation, of which genetic
algorithms are one of the major components.
There are two simple requirements for a problem to be solvable by genetic algorithms.
1) Its arguments must be expressible as binary strings, and
2) One must always be able to compare two partial solutions and determine which is better.
This second requirement eliminates many of the more interesting problems.
One consequence of the first requirement is that every partial solution to the problem is
encoded by a binary string of the same length; denote this length by M. Consider a problem
in which the solutions can be encoded as 24-bit strings (M = 24). Two possible solutions would be
111100001111000011110000, and
111111000000111111000000.
Genetic algorithms function by applying two basic operations to partial solutions in order to
find better partial solutions. These operations might be called mutation and crossing. The
process is as follows: one applies the operation to single partial solutions or pairs of partial
solutions and then evaluates the resulting partial solutions. If the newly generated partial
solutions are better than the previous ones, the new ones are kept; otherwise they are likely
discarded. Genetic algorithms apply processes seen to operate well in nature: changes are
made randomly and the better ones kept.
Mutation
Mutation refers to taking a partial solution and changing one or more bits. In common
practice, only one of the bits is changed. The bit to be changed is selected randomly.
The sequence 111100001111000011110000
can mutate to 111100001111010011110000. Note that only one bit (bit 14, counting from the left) was changed.
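
As a minimal sketch of the mutation operation, assuming partial solutions are held as Python strings of '0' and '1' characters and that bit positions are counted from 1 at the left: the function name `mutate` and its parameter `p` are illustrative choices, not names from the textbook.

```python
import random
from typing import Optional


def mutate(bits: str, p: Optional[int] = None) -> str:
    """Return a copy of `bits` with bit p flipped (p is 1-based, from the left).

    If p is omitted, the position is chosen at random.
    """
    if p is None:
        p = random.randint(1, len(bits))  # randint is inclusive on both ends
    flipped = "1" if bits[p - 1] == "0" else "0"
    return bits[:p - 1] + flipped + bits[p:]


# The example from the text: flipping bit 14.
print(mutate("111100001111000011110000", p=14))  # 111100001111010011110000
```

Calling `mutate(s)` without `p` flips one randomly selected bit, which is the common practice described above.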
Crossing
Crossing (the textbook calls this mating) refers to swapping parts of two partial solutions. It
operates by selecting a bit number at random, splitting each of the two partial solutions at
that point, and then recombining by swapping parts. Consider the following two partial
solutions which are split at the indicated point.
111100001111000011110000 => 11110000111100 0011110000
111111000000111111000000 => 11111100000011 1111000000
Crossing at this point gives
11110000111100 1111000000 or 111100001111001111000000
and 11111100000011 0011110000 or 111111000000110011110000.
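
The crossing operation can be sketched in the same style. This is only an illustration, assuming that "splitting at point p" means splitting after the first p bits, which is what the example above does with p = 14. The function name `cross` is hypothetical.

```python
from typing import Tuple


def cross(a: str, b: str, p: int) -> Tuple[str, str]:
    """Split both strings after the first p bits and swap the tails."""
    if len(a) != len(b):
        raise ValueError("both partial solutions must have the same length M")
    if not 0 < p < len(a):
        raise ValueError("the split point must fall inside the string")
    return a[:p] + b[p:], b[:p] + a[p:]


# The example from the text: crossing after bit 14.
first, second = cross("111100001111000011110000",
                      "111111000000111111000000", 14)
print(first)   # 111100001111001111000000
print(second)  # 111111000000110011110000
```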
Written on 11/28/2020 Chapter 7 Page 1 of 6 pages


Genetic Algorithms My Way

The basic process of genetic algorithms, as discussed on page 220 of the textbook, comes down to three simple steps.

  1. Start with a set of trial solutions, generated by some method.
  2. Play with these solutions, generating new ones and evaluating them.
  3. After some time, stop the process and take the best solution.

We shall comment on each of the steps presented in the textbook. Rather than using the biological language favored by the textbook and most authors, we shall use the language of binary strings. The student should note that we are saying the same thing.

These notes will present a problem not found in the textbook. The problem is based on a common encryption scheme, called XOR encryption. The student who did not sleep through CPSC 2105 (Computer Organization) will note that this problem can be easily solved by a very direct application of Boolean algebra, but here we shall attempt a solution by use of genetic algorithms. It is better to pick an easy problem as an example.

The problem to be discussed involves cryptography. The objective of cryptography is to send messages via insecure media (such as by radio broadcast) and have the contents of the message known only to those authorized to receive the message. The sender of a message applies a cryptographic algorithm, often called a cipher, to convert a message in readable form, called a plaintext, into a disguised form called a ciphertext. The assumption is that the ciphertext can be decrypted, or converted to readable plaintext, only by those who are authorized to receive the message. Most ciphers operate by using a well-known cryptographic algorithm and a secret key, known only to the sender and recipient.

Cryptanalysis is the study of ways to retrieve the meaning of a ciphertext without the consent of either the sender or recipient. There are various methods of cryptanalysis, some of which are less elegant than others. Two inelegant methods are stealing the key that enables one to decrypt the messages, and “rubber hose cryptanalysis”, which involves applying rather nasty methods of persuasion to a human who knows the key.

Some of the weaker cryptographic algorithms can be easily broken by use of well-known attacks. An attack itself is an algorithm that can be used to produce the plaintext of a message given enough ciphertext. There are several classes of attacks. A known plaintext attack on a cipher occurs when the hacker attempting to crack the code has both the plaintext of a message and its enciphered form. For many of the weaker cryptographic algorithms, this attack can yield the secret key and thus allow the cryptanalyst to read all future messages. The goal of all modern cryptographic algorithms is to prevent such an attack.

There is a caution to be observed here. One should not put too much trust in the security of ciphers. In a well-documented example of American cryptanalysis of Japanese diplomatic ciphers before the attack on Pearl Harbor, the Americans intercepted messages to the Japanese ambassador to the US and had delivered translated plaintext to the US Secretary of State before the Japanese ambassador received his copy from his code room.
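
For comparison with the genetic approach, here is a minimal sketch of the direct Boolean-algebra solution mentioned above. It assumes that XOR encryption means a bitwise XOR of the plaintext with a key of the same length, and it uses the 24-bit plaintext and known ciphertext that appear later in these notes. The helper name `xor_bits` is illustrative.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length strings of '0' and '1'."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


plaintext = "010000110101001101010101"
known_ciphertext = "001001010100100111100001"

# Since ciphertext = plaintext XOR key, it follows that
# key = plaintext XOR ciphertext.
key = xor_bits(plaintext, known_ciphertext)
print(key)

# Sanity check: encrypting with the recovered key reproduces the ciphertext.
assert xor_bits(plaintext, key) == known_ciphertext
```

This is why XOR encryption falls immediately to a known plaintext attack; the genetic algorithm in these notes searches for the same key without using that identity.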

Step 3

Select the initial population of solutions X1, X2, …, X14. The book suggests a random selection, but one may specify solutions if there is any reason to favor some. Our solution set will begin with four non-random 24-bit partial solutions and 10 partial solutions derived from this set.

In other problems, one might have an idea of a beginning solution. Dr. Tim Howard, of the CSU Math Department, is working on a problem called the “Prisoners and Guards Puzzle”. This involves placing 1’s and 0’s on a square array and attempting to maximize the number of 1’s given a specific constraint. Dr. Howard has a method for generating solutions that are known to be good and wants to use genetic algorithms to explore for solutions that are better. It just makes sense to seed the initial solution set with those solutions that are known to be good.

Of course, random selection of the solutions for the initial set is as good a method as any. It should be employed if one has no reason to favor particular solutions. For this example, I choose to generate non-random solutions for the initial set because that is easier. Here are the four solutions that will form the basis of the initial set of 14.

Note on Random Numbers

It is not possible in software to generate a truly random sequence of numbers. Most generators, such as the Microsoft function Rnd() supplied with Visual Basic, are pseudo-random generators that produce a sequence of numbers with a long repetition pattern; that is, one can generate a very long sequence before the numbers repeat. This suffices for our purpose.

Step 4

Each potential solution is used to encrypt the known plaintext, and the ciphertext produced is compared against the known ciphertext. The number of differences is counted. For example, one of our starting keys will be 111111000000111111000000.

Plain Text           010000110101001101010101
This key             111111000000111111000000
Ciphertext produced  101111110101110010010101
Known Ciphertext     001001010100100111100001
Differences          *..**.*....*.*.*.***.*..

(* marks a bit that differs; . marks a bit that matches.)

This key scores 11 out of a possible 24, where a score of 0 would mean that the key reproduces the known ciphertext exactly. We would hope that the genetic algorithm quickly finds a potential key with a smaller score.
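
The scoring in Step 4 can be sketched as follows. This is a hedged illustration, assuming bitwise XOR encryption as in the table above. The names `xor_bits` and `score` are hypothetical.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length strings of '0' and '1'."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def score(key: str, plaintext: str, known_ciphertext: str) -> int:
    """Count the bits where encrypting with `key` misses the known ciphertext."""
    produced = xor_bits(plaintext, key)
    return sum(x != y for x, y in zip(produced, known_ciphertext))


plaintext = "010000110101001101010101"
known_ciphertext = "001001010100100111100001"

# The starting key from the worked example.
print(score("111111000000111111000000", plaintext, known_ciphertext))  # 11
```

A lower score is better here, so "fitness" in this problem is something to be minimized.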

Steps 5 – 8

At each step, we use the following algorithm to generate the set of 14 potential solutions.

  1. Pick the four potential solutions with the lowest score. At the start we begin with the initial four solutions that are specified.
  2. Copy these four solutions into the new solution set.
  3. Mutate each of these four solutions to produce a new solution. For the first mutation the point will be selected as p = 17. For others, pick a random number p, 1 ≤ p ≤ 24.
  4. There are six different ways that the four solutions can be paired into a 2-set: (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), and (3, 4). Use these six pairings of the four best solutions to form six more solutions by crossing at the same point. For the first crossing, the point will be selected as p = 13; afterwards the value will be random.

At the beginning of the first iteration, we have a set comprising 4 initially specified solutions, 4 solutions obtained by mutating these solutions, and 6 solutions obtained by crossing these specified solutions.

At the beginning of any other iteration, we have a set comprising the 4 best solutions from the previous iteration, 4 solutions obtained by mutating them, and 6 solutions obtained by crossing them.

We then evaluate each solution and pick the four with the lowest scores, because those will be closest to the desired key. If any of the potential solutions has a score of 0, we declare it the solution and stop the iteration. If we do not have a solution, we copy the four best partial solutions for use in the next iteration and discard the rest. We then do steps 5 through 8 over again.

Here is the initial set of 14 partial solutions. Some digits underlined only for clarity.

First we have the initial four 111100001111000011110000

The four mutations at p = 17 111100001111000001110000

The six crosses (1, 2) 111100001111011100001111
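
The iteration of Steps 5 – 8 can be sketched as below. This is an illustration under stated assumptions, not the textbook's procedure. Only the first two seed strings appear in these notes; the other two are hypothetical placeholders. The sketch also assumes one mutation point and one crossing point per iteration, that each pair (i, j) yields the head of solution i joined to the tail of solution j, and that random crossing points fall between 1 and 23. All function names are illustrative.

```python
import random
from itertools import combinations

PLAINTEXT = "010000110101001101010101"
KNOWN_CIPHERTEXT = "001001010100100111100001"


def score(key):
    """Bits where PLAINTEXT XOR key differs from the known ciphertext."""
    produced = ("1" if p != k else "0" for p, k in zip(PLAINTEXT, key))
    return sum(c != t for c, t in zip(produced, KNOWN_CIPHERTEXT))


def mutate(bits, p):
    """Flip bit p (1-based, counted from the left)."""
    flipped = "1" if bits[p - 1] == "0" else "0"
    return bits[:p - 1] + flipped + bits[p:]


def next_generation(population, rng, mutate_p=None, cross_p=None):
    """Build 14 solutions: the 4 best, their 4 mutations, and 6 crosses."""
    best = sorted(population, key=score)[:4]
    m = len(best[0])
    if mutate_p is None:
        mutate_p = rng.randint(1, m)      # 1 <= p <= 24 for 24-bit keys
    if cross_p is None:
        cross_p = rng.randint(1, m - 1)   # assumed range for the split point
    mutated = [mutate(s, mutate_p) for s in best]
    crossed = [best[i][:cross_p] + best[j][cross_p:]
               for i, j in combinations(range(4), 2)]
    return best + mutated + crossed


seeds = [
    "111100001111000011110000",  # from the notes
    "111111000000111111000000",  # from the notes
    "000011110000111100001111",  # hypothetical seed
    "000000111111000000111111",  # hypothetical seed
]

rng = random.Random(0)  # seeded pseudo-random generator, for repeatable runs
population = next_generation(seeds, rng, mutate_p=17, cross_p=13)
for iteration in range(1, 1001):  # cap the work in case no key is found
    champion = min(population, key=score)
    if score(champion) == 0:
        break
    population = next_generation(population, rng)
print(iteration, champion, score(champion))
```

Because the four best solutions are always copied forward, the best score can never get worse from one iteration to the next. The cap of 1000 iterations is an arbitrary safety limit.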