
beginner's guide to information theory

I'm feeling better tonight. Last night I did very poorly on an exam, hence the bitchiness. Today I sat down and looked at some numbers, and realized that since I did pretty well on the first two tests, last night's score (plus the final) doesn't really mean all that much. I have to average a 55% to keep my B. I could all but not show up and still get a C... which is really what matters, since my company requires a C or better to repay my tuition (the going phrase around the office is "C is for Cash").

With that out of the way, I would like to write a little about what I've been studying this semester: Information Theory. Because even though I hate school right now, a little part of me deep down inside is appreciating the material and even sometimes thinks it might be cool. And who knows, someone else here might think the same thing... I keep trying to explain it all to people, and some of them just immediately smile that "please stop" smile, tell me they can't even begin to understand it, and compliment me for being smart enough to even get past the first chapter. This isn't really a compliment... being smart enough to understand brilliant ideas is pointless if you're not capable of explaining them in layman's terms to anyone new off the street. But I figure maybe it's not me, and they just don't want to understand it... math phobia? Other people, mostly ones who are uncomfortable admitting they don't get something, hear me out on it and eventually agree that it is kind of cool. So here's hoping my lj readers are that kind.

Information theory starts with the idea that we transmit symbols using bits (for example, in ASCII, A = 01000001 and B = 01000010) because computers talk in 0s and 1s. On a transmission line, we can represent 0s and 1s with things like frequencies or voltage levels (one level means 0, another means 1) or whatever, but either way each bit takes a little time to transmit. You have to send the "1" signal for a while, and the "0" signal for a while, and give the guy on the other end time to process it. Or, if you're storing these 0s and 1s in a file, they take up space on your hard drive, and that's annoying too.
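To make that concrete, here's a tiny Python sketch (mine, not anything from the class notes) showing the fixed cost: in plain 8-bit ASCII, every character takes eight bits no matter how common it is.

```python
# Fixed-length encoding: in ASCII, every character costs 8 bits.
message = "ABBA"
bits = "".join(format(ord(ch), "08b") for ch in message)
print(bits)       # 01000001 01000010 01000010 01000001 (shown here with spaces)
print(len(bits))  # 32 bits for a 4-character message
```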

So if you start looking at your signal, and notice that certain symbols repeat a lot, you assign shorter bit sequences to the common ones (for example, A = 1, B = 0001). Rather than grouping eight bits in a row together every time to figure out what letter you just sent, the receiver at the other end looks for those unique patterns. Huffman coding is a cool example of this... the wiki explanation looks sort of crazy, so maybe I'll take that one on in another entry (there's a toy version just below, though).
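Until then, here's a toy Huffman coder in Python (my own sketch, decoder omitted) so you can see the trick: keep merging the two rarest symbols into a little tree, and the common symbols end up near the top with the shortest codes.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a prefix-free code where frequent symbols get shorter bit strings."""
    # Heap entries are (frequency, tiebreaker, tree); the unique tiebreaker
    # keeps heapq from ever comparing two trees. A tree is either a symbol
    # (a leaf) or a (left, right) pair (an internal node).
    heap = [(freq, i, sym) for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # the two rarest trees...
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))  # ...get merged
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):      # internal node: 0 goes left, 1 goes right
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                            # leaf: record this symbol's code
            codes[tree] = prefix or "0"  # lone-symbol edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes("AAAABBC"))
# {'C': '00', 'B': '01', 'A': '1'} -- the most common letter gets the 1-bit code
```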

Anyway, there's an expression we use for the uncertainty of a signal called "entropy" (not the thermodynamic entropy, in case someone here is into that). If I'm sending you a message, and every single letter happens the same number of times, then you have no way to predict what's going to come next. But if I send you mostly As and the occasional B, then you can be pretty certain that the next letter you'll get is an A. More certainty = less entropy, and less entropy means you can squeeze the same message into fewer bits. So we also look at ways to exploit the structure between symbols (e.g., U generally follows Q, and we can use that to help us out).
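For the record, the formula is the standard Shannon entropy, H = -Σ p(x) log2(p(x)), measured in bits per symbol. A quick sanity check in Python (again my sketch, not course material):

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits per symbol: H = -sum(p * log2(p))."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Two letters, equally likely: maximum uncertainty for a 2-symbol alphabet.
print(entropy([0.5, 0.5]))  # 1.0 bit per symbol
# Mostly As with the occasional B: far more predictable, far less entropy.
print(entropy([0.9, 0.1]))  # ~0.47 bits per symbol
```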

So that was the first day of class, and the rest of the semester was spent looking at probability distribution functions (which are pretty terrible and should be avoided) and how we can use what we know from them to use fewer bits to transmit the same set of data, every time, with no distortion or loss.

And in two weeks the class will be over and I will forget all this, and part of me can't wait. You know?

Comments

dwh
Apr. 26th, 2007 04:44 am (UTC)
Sounds like networks.

And actually, networks is mostly applied information theory.

I hate CRCs. They're kinda spiffy on paper, and then stab you in the gut when you try to implement them.

Man, I can't wait to be over with this class, and over with programming. It's so close I can smell it.
spacefem
Apr. 26th, 2007 10:30 pm (UTC)
I've had some networks... it's good times! Lots of interlacing and talking about fairness. I'm taking a routing class next semester (probably) that will be a party every day, I'm sure.
dwh
Apr. 27th, 2007 12:19 am (UTC)
Ahh, but I bet you were blessed with a professor who doesn't put you straight to sleep every class. Some people have virtually stopped coming to class because they learn more when they read the textbook on their own time. It's kinda sad.
crisco747
Apr. 26th, 2007 08:15 am (UTC)
I understood that. Hope it makes you feel better!
chezmax
Apr. 26th, 2007 12:41 pm (UTC)
Information Theory is kind of fun. :)

Next you should try wireless engineering... Sooo much stuff.
j0nkatz
Apr. 26th, 2007 02:05 pm (UTC)
You're so hot when you talk binary like that. :D
spacefem
Apr. 26th, 2007 10:28 pm (UTC)
wait until I give it to ya in hex ;)
jume
Apr. 27th, 2007 01:31 am (UTC)
information theory sounds wonderful :o
whatupbitch
Apr. 27th, 2007 04:46 am (UTC)
This does sound wonderful. In fact it sounds like this is a huge part of how file compression works, and I suspect it plays a significant role in traditional data encryption.

However, quantum encryption is still the coolest thing I've ever seen. I friggin' love it. (See also http://en.wikipedia.org/wiki/Quantum_cryptography.)
aardvarklf
Apr. 27th, 2007 07:40 am (UTC)
What sort of data would have low entropy? Are we talking about a really repetitive music CD here? I'm just trying to understand what low signal entropy means in terms of the kind of data computers deal with.

Interesting stuff, btw. But I'll keep my physics entropy...or rather, I'll keep trying to minimise it... :-)
spacefem
Apr. 28th, 2007 03:12 pm (UTC)
text has low entropy because it has a lot of repeated letters and sequences of letters... apparently there are all sorts of projects out there where people have tried to represent the bible in as few bits as possible. Some images also have low entropy, if most of the pixels are around one color or luminance level.

and there are methods for finding the entropy of a continuous source (like music) but it's a lot messier so I haven't dealt with it much.
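here's a quick way to see it in numbers, if anyone wants to play along (a rough sketch of mine that only counts single characters, so it misses the repeated-sequence structure entirely):

```python
from collections import Counter
from math import log2

def empirical_entropy(text):
    """Bits per character, estimated from single-character frequencies."""
    n = len(text)
    return -sum(c / n * log2(c / n) for c in Counter(text).values())

print(empirical_entropy("aaaaaaab"))  # skewed: ~0.54 bits/char
print(empirical_entropy("abcdefgh"))  # evenly mixed: 3.0 bits/char
```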
