Saving Bits

A biologist wants to encode a large genetic sequence consisting of the letters C, G, A, and T.

In his data, C and G were the most common letters, so he chose the encoding C -> 1, G -> 0, A -> 00, T -> 01. For example, CCGT is encoded as 11001.

Later, the biologist noticed 11001 in his codebase. He concluded that it represented CCGT. Is he correct?

The following code allows you encode a DNA string. Try running the code, and then change the string in line 12 to see if you can find other strings with the same encoding!

Sign up or Log in to access our interactive code editor.


Bonus 1: Can you describe a better encoding scheme?

Bonus 2: Can you write code that returns all possible genetic sequences from a given encoding?

×

Problem Loading...

Note Loading...

Set Loading...