
Sunday, December 15, 2013

Scientific Computing: Open Source Software


     There is a great need in the scientific community for software that simplifies and reduces the work required to solve complex mathematical problems; solving them by hand would be slow and error-prone. Scientific computing aims to tackle complicated problems in a range of fields including the physical and engineering sciences, finance and economics, and the medical, social, and biological sciences. It can also enhance the communication of information by creating visual representations of scientific data. The major numerical computing environment and programming language most people have heard of is MATLAB. Unfortunately MATLAB is proprietary software and carries a high monetary cost. Fortunately there are open source alternatives that offer much, if not all, of the capability required for scientific computation.

     SciPy is an open source computing environment built for the Python programming language. The core elements are the NumPy and SciPy libraries, which include the algorithms and mathematical tools required for core scientific computing. Additional libraries expand SciPy's features, such as the Matplotlib library, which is used for plotting.
 

Here’s a list of some of SciPy’s features and their packages:
• Special Functions (scipy.special)
• Signal Processing (scipy.signal)
• Fourier Transforms (scipy.fftpack)
• Optimization (scipy.optimize)
• General plotting (scipy.[plt, xplt, gplt])
• Numerical Integration (scipy.integrate)
• Linear Algebra (scipy.linalg)
• Input/Output (scipy.io)
• Genetic Algorithms (scipy.ga)
• Statistics (scipy.stats)
• Distributed Computing (scipy.cow)
• Fast Execution (weave)
• Clustering Algorithms (scipy.cluster)
• Sparse Matrices (scipy.sparse)

These packages provide the wide variety of functions required by the scientific community. If you are looking for a powerful open source environment for scientific computing, visit http://www.scipy.org/ and download the software.
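
As a small taste of what the NumPy/SciPy/Matplotlib stack looks like in practice, here is a minimal sketch that numerically integrates a function with scipy.integrate and then plots it. The function being integrated and the plot labels are just illustrative choices of mine, not anything specific to the packages beyond the standard quad and pyplot calls.

```python
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt

# Numerically integrate a damped oscillation over [0, 10].
f = lambda x: np.exp(-x / 3.0) * np.cos(2.0 * x)
area, abs_error = integrate.quad(f, 0, 10)
print("Integral of f on [0, 10]:", area, "+/-", abs_error)

# Plot the same function with Matplotlib.
x = np.linspace(0, 10, 500)
plt.plot(x, f(x), label="exp(-x/3) * cos(2x)")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.show()
```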

Get started with Python and SciPy: Introduction to Scientific Computing

Sunday, December 8, 2013

Computer Graphics: CAPTCHA Image Processing

In 1999, slashdot.com created an online poll asking which graduate school had the best computer science program. This was a big mistake. Both MIT and Carnegie Mellon students wrote programs, or “bots,” that voted for their school. As a result the poll became a contest between the voting “bots,” and each of those schools ended up with over 20,000 votes while the rest had less than 1,000 votes. This led to research into preventing such programs, and the CAPTCHA was created. CAPTCHA stands for “Completely Automated Public Turing Test to Tell Computers and Humans Apart.” The idea is that a CAPTCHA is a test that humans can pass but computers can't pass with a probability greater than guessing. What does the CAPTCHA have to do with computer image processing?


CAPTCHAs are distorted images that computers can't solve because of the segmentation problem. Computers are actually better than humans at the fundamental task of recognizing individual characters in isolation. Yet computers fail at separating letters from each other, recognizing distorted letters, and understanding the context of each letter. Humans, on the other hand, excel at recognizing the letters and the resulting words. Computers are not able to recognize distorted letters reliably because there are infinitely many possible distortions. They also struggle to separate letters from each other because CAPTCHA images have lines running across the words and confusing background patterns. Thus CAPTCHA image processing is a difficult problem in the field of artificial intelligence. One last interesting thought: a CAPTCHA system is a program that can generate and grade tests that it itself cannot pass.


Sunday, December 1, 2013

Cryptography: TLS/SSL Protocol

     Network interactions require specific protocols in order to take place securely. These protocols are built around authentication and confidentiality; a given protocol may provide one or both. They allow you to make secure transactions, application connections, and user connections over non-secure networks. Examples of such protocols are TLS/SSL, IPSec, and Kerberos. I'll focus this blog post on TLS/SSL since that is the protocol most visible to everyone today.


TLS/SSL
     We all use this protocol when we browse the internet because TLS/SSL is the underlying security protocol for HTTPS. The protocol is implemented at the socket layer (to use it, applications have to implement it) and is relatively simple. TLS/SSL's main purpose is to secure transactions. To purchase an item you want to be sure you are dealing with the real business (authentication), you want your credit card information to be protected (confidentiality and integrity), and the business does not need to authenticate you since all they want is the money (no mutual authentication).

Now to the actual steps of the protocol. If you are ready to purchase an item on Amazon, the first step is for you to request a connection with Amazon. Along with the request you send a list of ciphers that you support and a random nonce (number used once). Amazon then replies with its certificate, a chosen cipher, and its own random nonce. You reply with a secret encrypted with Amazon's public key and another encrypted message that is used for an integrity check and establishes a session key. Amazon replies with one last message to prove that it was able to decrypt your previous messages.

Two parts of this exchange deserve special attention: the certificate sent by Amazon and the established session key. The certificate prevents a man-in-the-middle attack because it is signed by a certificate authority and your browser checks the certificate signature. If an attacker sends a false certificate the browser will see that it is not properly signed and give you a warning. Unfortunately users can ignore this warning and allow the connection to proceed, which lets the man-in-the-middle attack succeed. This is a flaw in human nature, not in the protocol. The other important part is the session key, which is a hash of the secret you sent and both of the nonces. Your browser often opens multiple parallel connections to improve performance. Establishing a TLS/SSL session is costly, but given an existing session new connections are cheap: any number of new connections can be derived from the existing session to allow multiple parallel connections.
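
To make the message flow concrete, here is a minimal Python sketch of the simplified, textbook-style handshake described above. It only simulates the sequence of messages and the session-key derivation; the certificate string, cipher names, and the choice of SHA-256 are illustrative assumptions, no real encryption or network traffic is involved, and real TLS is considerably more involved.

```python
import os
import hashlib

# A toy walk-through of the simplified SSL-style handshake described above.

def derive_session_key(secret, client_nonce, server_nonce):
    # Session key = hash of the client's secret and both nonces.
    return hashlib.sha256(secret + client_nonce + server_nonce).digest()

# 1. Client hello: supported ciphers and a random nonce.
client_nonce = os.urandom(32)
client_hello = {"ciphers": ["AES-128-CBC", "3DES"], "nonce": client_nonce}

# 2. Server hello: certificate, chosen cipher, and the server's own nonce.
server_nonce = os.urandom(32)
server_hello = {"certificate": "<CA-signed certificate>",
                "cipher": "AES-128-CBC",
                "nonce": server_nonce}
# A real browser would verify the CA signature on the certificate here.

# 3. The client sends a secret; in the real protocol it is encrypted with
#    the server's public key so only the server can recover it.
pre_master_secret = os.urandom(48)

# 4. Both sides derive the same session key from the secret and the nonces,
#    so any number of cheap parallel connections can reuse this session.
client_key = derive_session_key(pre_master_secret, client_nonce, server_nonce)
server_key = derive_session_key(pre_master_secret, client_nonce, server_nonce)
assert client_key == server_key
print("Shared session key:", client_key.hex())
```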

Book I've been reading: Information Security: Principles and Practice by Mark Stamp.

Sunday, November 24, 2013

Artificial Intelligence: Technological Singularity

The concept of artificial intelligence (AI) has been around as long as the idea of machines and computers. People are fascinated by the idea that it is possible to code software that can “think” to a certain extent. Technologies with AI are all around us, but we don't always think of them as AI. This can be attributed to all the movies featuring AI far more advanced than today's, and/or to the fact that we have become used to AI being part of the world. Current examples of AI include robots in car factories, automated customer service, the Roomba vacuum cleaner, IBM's Watson, and self-parking cars. The two major AI areas at the moment are voice-recognition software and self-driving cars. The major use of AI is to improve efficiency and to help humans with dangerous or difficult tasks; there are smart robots disabling land mines and handling radioactive materials.


As mentioned earlier, the AI technology available today is rather one-dimensional compared to what one can see in movies. AI is only as smart as the code that it uses. I don't think we are anywhere near creating a truly “intelligent” AI, one that has the capabilities of human thought. Whether self-awareness can ever be achieved in a machine is debatable. One view is that if Moore's law continues to hold, then it's only a matter of time before humans create a machine with superhuman intelligence. This view was put forward by Vernor Vinge, who went as far as saying that it will occur by the year 2030. If mankind ever develops software that allows a machine to analyze data, make decisions, and act autonomously, then we can expect to see machines begin to design and build even better machines. In turn, the new machines can build even more powerful machines. Once these machines are able to improve themselves, humans will become obsolete since the machines will have more intelligence than us; this point is called the technological singularity. What will happen then?


Sunday, November 17, 2013

History of Computer Science: Cryptography with Digital Computers

                With the computer revolution came more advanced cryptographic techniques that had previously been impossible, or at the very least impractical. In 1949, Claude Shannon, already crowned “the father of information theory” for his earlier work, started the cryptographic revolution with his paper Communication Theory of Secrecy Systems, in which he applied rigorous mathematical techniques to analyze and prove the security of cryptographic algorithms. The Lucifer cipher, developed by Horst Feistel in the 1970s while he was working for IBM, paved the way for modern symmetric key ciphers. By the mid-1970s the computer revolution was in full swing and it became clear that digital data needed to be secured. At the time cryptography was a field reserved for the military and the government, until the National Bureau of Standards called for cipher proposals. The only serious contender was the Lucifer cipher, which the NBS handed to the government experts, the NSA, who modified it and created the Data Encryption Standard (DES). With the ever-increasing computational power of computers, DES has since been replaced by Triple DES and AES.



                During the same time that symmetric key cryptography was being developed, another cryptographic technique was being born: public key cryptography. In 1976, Whitfield Diffie and Martin Hellman published a paper titled New Directions in Cryptography which introduced public key cryptography and one-way functions. Unlike symmetric keys, which required the key to be shared before communication could take place, the Diffie-Hellman key exchange allowed secure connections without prior key sharing. One-way functions allowed public key cryptosystems to flourish because they are easy to compute in one direction but computationally infeasible to invert. Diffie-Hellman inspired RSA, which is still used today for public key cryptography. RSA was published in 1977 by Ronald L. Rivest, Adi Shamir, and Leonard M. Adleman. For internet security, PGP was released in 1991 and is still considered secure today. PGP uses public key cryptography, so knowing the encryption key does not let a sender determine the decryption key. Cryptography has become extremely important and will only become more so as the power of computers increases along with the growth of digital data and the internet.
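
To illustrate the idea behind Diffie-Hellman and one-way functions, here is a toy Python sketch with deliberately tiny numbers. The modulus, generator, and secrets are made-up illustrative values; real implementations use vetted parameters with primes thousands of bits long.

```python
# Toy Diffie-Hellman key exchange with tiny, insecure numbers.
# Modular exponentiation is the one-way function: computing g**x mod p is
# easy, but recovering x from the result (the discrete logarithm) is hard
# when p is large.

p = 23          # public prime modulus (toy value)
g = 5           # public generator

alice_secret = 6                     # chosen privately by Alice
bob_secret = 15                      # chosen privately by Bob

alice_public = pow(g, alice_secret, p)   # sent over the open network
bob_public = pow(g, bob_secret, p)       # sent over the open network

# Each side combines its own secret with the other's public value.
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

assert alice_shared == bob_shared
print("Shared secret:", alice_shared)    # same key, no prior key sharing needed
```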

More detailed history here and here.
More info on different cryptography systems here.
Book I've been reading: Information Security: Principles and Practice by Mark Stamp.

Saturday, November 16, 2013

History of Computer Science: Cryptography Before Digital Computers

The beginning of cryptography was when humans spoke their first words. Even to this day a language can be considered a form of cryptography, because if you don't know the language another person is speaking you will have no idea what secrets they are discussing. The same goes for written language, since a majority of people, up until recently, were not able to read. Speaking and writing are easily breakable nowadays though. The Egyptian hieroglyphs could be considered a form of cryptography too, as they used pictures to hide their stories. The first algorithmic approach to securing a message came from the Greeks, who devised the Spartan scytale around the 7th century B.C. A strip of parchment was wrapped around a rod of a particular diameter and the message was written along the rod; without a rod of matching diameter the letters appear scrambled. The Caesar Cipher appeared during, you guessed it, Julius Caesar's rule and was used for war (as was the scytale). The Caesar Cipher, a monoalphabetic cipher, used simple substitution as a form of confusion. There was little advancement in cryptography until the Middle Ages, but the Arabs did make headway in cryptanalysis by using frequency analysis.
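
As a quick illustration of simple (monoalphabetic) substitution, here is a minimal Python sketch of the Caesar cipher; the shift of 3 is the classical choice, and the helper name is my own.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar(text, shift):
    # Monoalphabetic substitution: every letter is shifted by the same amount,
    # which is why frequency analysis breaks it so easily.
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    table = str.maketrans(ALPHABET, shifted)
    return text.upper().translate(table)

ciphertext = caesar("ATTACK AT DAWN", 3)
print(ciphertext)                 # DWWDFN DW GDZQ
print(caesar(ciphertext, -3))     # ATTACK AT DAWN
```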



                In the 1400s, Leon Battista Alberti, “The Father of Western Cryptology,” developed polyalphabetic substitution. Polyalphabetic substitution uses multiple cipher alphabets to hide the plaintext, allowing different ciphertext symbols to represent the same plaintext symbol. During the 16th century, Blaise de Vigenère made improvements to polyalphabetic substitution, and the resulting cipher remained in use until the Civil War. Around WWI, codebook ciphers and the one-time pad appeared. The one-time pad was started by Gilbert Vernam and improved by Joseph Mauborgne. If the key of a one-time pad is truly random and used only once, the cipher provides perfect secrecy. Arthur Scherbius invented the Enigma machine at the end of WWI; it was used commercially at first and then improved by the German government for use in WWII. The machine was broken by a Polish cryptologist, Marian Rejewski, whose work was passed on to Alan Turing and the codebreakers at Bletchley Park, who built the Bombes, electromechanical machines designed specifically to break Enigma.
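
To show how polyalphabetic substitution differs from the Caesar cipher sketched above, here is a minimal Vigenère example in Python. The keyword "LEMON" is just an illustrative choice, and the function name is my own.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def vigenere(text, key, decrypt=False):
    # Each letter of the key selects a different Caesar shift, so the same
    # plaintext letter can map to different ciphertext letters.
    sign = -1 if decrypt else 1
    out = []
    for ch, k in zip(text.upper().replace(" ", ""), cycle(key.upper())):
        shift = sign * (ord(k) - ord("A"))
        out.append(ALPHABET[(ord(ch) - ord("A") + shift) % 26])
    return "".join(out)

ct = vigenere("ATTACK AT DAWN", "LEMON")
print(ct)                                    # LXFOPVEFRNHR
print(vigenere(ct, "LEMON", decrypt=True))   # ATTACKATDAWN
```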

More detailed history here and here.

More info on different cryptography systems here.

Sunday, November 10, 2013

File Sharing: Sharing is Caring

File sharing is what makes up the internet; the internet would not exist without the ability to share files between applications and people. Whether you are browsing the web, sending emails, or checking Facebook, you are sharing files. The issue that comes into play when sharing files is security. For most files, integrity is enough to share them across the internet, but sensitive information requires both confidentiality and integrity. And if you are downloading files from third-party sources, torrents, or possibly even from Dropbox, they could contain viruses or malware. Many layers of security are required, on both the host and the user end, to make sure the files are safe and secure.
            One aspect of file sharing is checking the integrity of the file. When you upload or send a file, someone could capture the packets and modify the file any way they desire. This is where integrity comes in: it tells the parties involved whether the original file has been tampered with. The two most common methods for verifying file integrity are the MD5 and SHA-1 hash functions. They compute a fixed-size hash over the contents of the file, but unfortunately they are not as secure as once believed; both have known collision weaknesses. The next level of security for file sharing is confidentiality. This requires files to be encrypted with a key before being sent. The key is either a symmetric key established between the parties, a public key, or a session key if a connection was established (hopefully using a secure protocol). The files are then encrypted with algorithms such as AES or DES. Executed properly, file sharing can provide both integrity and confidentiality.
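
For a concrete example of the integrity check described above, here is a minimal Python sketch that computes MD5 and SHA-1 digests of a file with the standard hashlib module; the file name is a placeholder. A recipient would recompute the digests and compare them against the values published by the source.

```python
import hashlib

def file_digests(path, chunk_size=8192):
    # Hash the file in chunks so large files don't have to fit in memory.
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

# "document.pdf" is a placeholder name; compare the printed values against
# the checksums published alongside the download.
md5_hex, sha1_hex = file_digests("document.pdf")
print("MD5:  ", md5_hex)
print("SHA-1:", sha1_hex)
```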




            The last part I want to touch upon is downloading files from file sharing applications. Third-party sites and torrents are often tricky for the user because anyone could have uploaded a file under any name. The most common example I have seen: you are looking for a specific PDF and find a file with a similar name, but instead of a .pdf extension it is an executable. One has to be very careful when the source is unknown or open to anyone.
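
One simple defence against the disguised-executable trick above is to look at the file's actual bytes rather than trusting its name. The sketch below checks whether a downloaded file really starts with the PDF magic number (%PDF) or with the Windows executable signature (MZ); the file name is a placeholder and the function name is my own.

```python
def looks_like(path):
    # Check the leading "magic" bytes instead of trusting the extension.
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"%PDF"):
        return "PDF document"
    if head.startswith(b"MZ"):
        return "Windows executable"
    return "unknown format"

# "downloaded_file.pdf" is a placeholder for whatever you just downloaded.
print(looks_like("downloaded_file.pdf"))
```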