Candidacy Lecture: Introduction to Reoptimization

 
Unless P=NP, we cannot obtain polynomial-time algorithms that solve hard combinatorial problems exactly. One practical approach to such problems is to relax the requirement of always finding an optimal solution and settle for “good enough” solutions. Algorithms that are guaranteed to produce a solution of a certain quality are called approximation algorithms. However, not all hard problems are approximable, i.e., for some problems we cannot obtain a polynomial-time algorithm that guarantees the quality of its solutions.
 
In this lecture, we will present the concept of reoptimization. In this approach, given an instance I of some problem Π, an optimal solution OPT for Π on I, and a modified instance I’ resulting from a local perturbation of I, we wish to use OPT in order to solve Π on I’. With this additional information, reoptimization may improve the approximability of the problem or the running time of its solution. In some cases, we can even obtain a polynomial-time approximation scheme (PTAS) for the reoptimization variant of a problem, provided the unmodified problem is approximable.
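
To make the idea concrete, here is a minimal sketch in Python (my own illustration, not an example from the lecture) using minimum vertex cover under edge insertion; the function name and the choice of problem are assumptions for illustration only.

    def reoptimize_vertex_cover(opt_cover, new_edge):
        """Given an optimal vertex cover opt_cover (a set) of a graph G and a
        newly inserted edge (u, v), return a cover of G' = G + (u, v).
        Every cover of G' also covers G, so OPT(G') >= OPT(G); the cover
        returned here has size at most OPT(G') + 1, i.e., it is a
        (1 + 1/OPT(G'))-approximation obtained in constant time."""
        u, v = new_edge
        if u in opt_cover or v in opt_cover:
            return opt_cover          # the old optimum already covers the new edge
        return opt_cover | {u}        # otherwise a single extra vertex suffices

    # Example: G is the path a-b-c with optimal cover {b}; adding the edge
    # (a, c) forms a triangle, and the returned cover {a, b} happens to be
    # optimal for the modified instance.
    print(reoptimize_vertex_cover({"b"}, ("a", "c")))

Knowing OPT for the old instance is what makes this cheap: without it, we would have to solve the modified instance from scratch.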

 

Randomized Computation



 

Lecture on Randomized Computation for our Computational Complexity class.

This lecture includes a short introduction to randomized algorithms. It covers probabilistic Turing machines (PTMs), i.e., TMs that make use of random numbers in their computation, as well as the complexity classes they recognize. These classes include bounded-error probabilistic polynomial time (BPP), RP, co-RP, and ZPP. The relationships among these classes were also presented, as well as their relationships to well-known complexity classes such as P and NP.
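
As a concrete example of the one-sided error that characterizes RP and co-RP, here is a standard Miller-Rabin compositeness test in Python (my own illustrative pick, not taken from the lecture slides). A single round never misclassifies a prime and errs on a composite with probability at most 1/4, so independent repetitions amplify the success probability.

    import random

    def miller_rabin_round(n: int) -> bool:
        """One round of Miller-Rabin: False means n is certainly composite;
        True means "probably prime", wrong with probability at most 1/4
        when n is composite -- the one-sided error of RP/co-RP algorithms."""
        if n < 4:
            return n in (2, 3)
        if n % 2 == 0:
            return False
        d, r = n - 1, 0
        while d % 2 == 0:           # write n - 1 = 2^r * d with d odd
            d //= 2
            r += 1
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            return True
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                return True
        return False                # a witnesses that n is composite

    def is_probably_prime(n: int, k: int = 20) -> bool:
        """Amplification: k independent rounds push the error below 4**-k."""
        return all(miller_rabin_round(n) for _ in range(k))

    print(is_probably_prime(561))        # False (almost surely): 561 is composite
    print(is_probably_prime(2**61 - 1))  # True: this Mersenne number is prime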

 

Motif Finding on GPUs using Random Projections

CLEMENTE, JHOIRENE. FINDING MOTIFS IN PARALLEL USING RANDOM PROJECTION ON GPUS (Under the direction of HENRY N. ADORNA, Ph.D.) Biological motifs are short patterns that have a significant number of occurrences in a set of DNA sequences. These motifs are transcription factor binding sites that help regulate transcription and therefore gene expression. Detection of these patterns helps in gene function discovery and in building regulatory networks. Mutations may occur at random positions of the genome, and these patterns are also subject to modification, making the problem more challenging. A variant called planted (l,d)-motif finding models the detection of these subtle motif patterns in the DNA. However, several existing algorithms fail to recover most of the planted (l,d)-motifs. To address this problem, a hybrid algorithm was proposed in the literature, which we will refer to as FMURP (Finding Motifs using Random Projection). It uses an initialization method called Projection to avoid being trapped in local maxima, and therefore increases the chance of recovering the planted motifs. This algorithm is shown to have high accuracy in solving the motif finding and planted (l,d)-motif finding problems. This research presents a parallel algorithm and implementation of FMURP on Graphics Processing Units (GPUs) using CUDA. It also provides details on the implementation and the optimizations done on the GPU in order to minimize space usage. The implementation, called CUDA-FMURP, was tested on randomly generated (l,d)-motif instances and is shown to recover the majority of the planted motifs. It is also shown that CUDA-FMURP obtains a maximum speedup of 6.8 using a 512-core GPU with compute capability 2.0.
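
Since the abstract centers on the Projection step, here is a small Python sketch of that bucketing idea (after Buhler and Tompa's original Projection algorithm; the function names, parameters, and toy data below are my own illustration, not code from CUDA-FMURP). Each l-mer is hashed by the letters at k randomly chosen positions, and crowded buckets seed the local refinement.

    import random
    from collections import defaultdict

    def projection_buckets(sequences, l, k, threshold):
        """One Projection iteration: hash every l-mer by the letters at k
        randomly chosen positions; buckets holding at least `threshold`
        l-mers are candidate planted-motif instances used to seed local
        refinement (e.g., expectation maximization)."""
        positions = sorted(random.sample(range(l), k))
        buckets = defaultdict(list)
        for s_idx, seq in enumerate(sequences):
            for off in range(len(seq) - l + 1):
                lmer = seq[off:off + l]
                key = "".join(lmer[p] for p in positions)
                buckets[key].append((s_idx, off))
        return {key: hits for key, hits in buckets.items() if len(hits) >= threshold}

    seqs = ["ACGTACGTGACG", "TTACGTGCATGC"]   # toy data
    print(projection_buckets(seqs, l=5, k=3, threshold=2))

Because mutated motif instances still agree on most positions, they tend to collide in the same bucket even though their full l-mers differ, which is why this beats starting refinement from random points.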

Review of the paper entitled “A Fast File System for UNIX”

This is a review of the paper entitled “A Fast File System for UNIX” by Marshall Kirk McKusick et al. of the Computer Systems Research Group at the University of California, Berkeley, published in 1984.

 

The paper discusses the authors’ reimplementation of the UNIX file system, which adapts the system to a wide range of peripheral and processor characteristics. The new implementation is measured to provide up to ten times the file access rate of the traditional UNIX file system. Several other improvements to the file system are also discussed, such as advisory locks on files, extension of the name space across file systems, the ability to use long file names, and administrative control of resource usage.

 

The two major contributions of this paper are modifications to the file system organization.

 

  1. The first modification concerns storage utilization. The study optimized storage utilization by increasing the block size: with bigger blocks, more of a file can be transferred in a single disk transaction, greatly increasing throughput. However, a UNIX file system is composed of many small files, so a large block size increases wasted space. To resolve this issue, a single block is further partitioned into one or more addressable fragments. Since these fragments are addressable, multiple small files can reside in one data block (see the sketch after this list).

 

  2. The second modification is file system parameterization. This modification is essential to performing optimal configuration-dependent block allocation. Each file system is parameterized and adapted to the type of disk on which it is placed. The parameters used are the speed of the processor, the hardware support for mass storage transfers, and the characteristics of the mass storage devices.
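
Here is a small worked example in Python of the fragment scheme from the first modification (my own simplified illustration, not code from the paper; real FFS applies fragments only to the tail of sufficiently small files and allows 1, 2, 4, or 8 fragments per block). It compares the space a file consumes with and without fragments, using the paper's example sizes of 4096-byte blocks and 1024-byte fragments.

    BLOCK = 4096   # block size in bytes (an example configuration from the paper)
    FRAG = 1024    # fragment size in bytes

    def space_used(file_size: int) -> int:
        """Simplified FFS-style allocation: full blocks for the bulk of the
        file, then just enough addressable fragments for the tail."""
        full_blocks = file_size // BLOCK
        tail = file_size - full_blocks * BLOCK
        tail_frags = -(-tail // FRAG)      # ceiling division
        return full_blocks * BLOCK + tail_frags * FRAG

    for size in (500, 5000):
        whole_blocks_only = -(-size // BLOCK) * BLOCK
        print(f"{size:>5}-byte file: {space_used(size)} bytes with fragments, "
              f"{whole_blocks_only} bytes without")
    # A 500-byte file occupies one 1024-byte fragment instead of a whole
    # 4096-byte block, so several small files can share a single data block.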

 

Although this paper significantly improved data transfer, writing remains more expensive than reading: the writing rate of the new file system is about 50% slower than its reading rate, because the kernel has to do twice as many disk allocations per second when writing.

 

INTRODUCTION TO COMPUTATIONAL INTELLIGENCE AND EVOLUTIONARY MULTIOBJECTIVE OPTIMIZATION

You are invited to a Special Lecture on

 

INTRODUCTION TO COMPUTATIONAL INTELLIGENCE AND

EVOLUTIONARY MULTIOBJECTIVE OPTIMIZATION

 

by

 

Gary G. Yen, Fellow IEEE, Fellow IET

Oklahoma State University

School of Electrical and Computer Engineering

 

November 23, 2012 (Friday), 2:00 – 3:00 pm

Accenture Ideas Room, Department of Computer Science

Alumni Engineers Centennial Hall, UP Diliman

 

Abstract

Evolutionary computation is the study of biologically motivated computational paradigms that draw novel ideas and inspiration from natural evolution and adaptation.  The application of population-based heuristics to solving multiobjective optimization problems has been receiving growing attention.  Searching for a family of Pareto-optimal solutions through nature-inspired problem-solving paradigms, Evolutionary Multiobjective Optimization algorithms (EMOs) have been successfully exploited to solve optimization problems in which the fitness measures and even the constraints are uncertain and change over time.

In this talk, I will overview the state of the art in the areas of Computational Intelligence and Evolutionary Multiobjective Optimization, aiming at the graduate students and researchers without much exposure to these growing technical fields.

 

About the Speaker

Gary G. Yen received the Ph.D. degree in electrical and computer engineering from the University of Notre Dame in 1992.  He is currently a Professor in the School of Electrical and Computer Engineering, Oklahoma State University.  His research interest includes intelligent control, computational intelligence, evolutionary multiobjective optimization, conditional health monitoring, signal processing and their industrial/defense applications.

Gary was an associate editor of the IEEE Transactions on Neural Networks and IEEE Control Systems Magazine during 1994-1999, and of the IEEE Transactions on Control Systems Technology, IEEE Transactions on Systems, Man and Cybernetics, and the IFAC journals Automatica and Mechatronics during 2000-2010.  He is currently serving as an associate editor for the IEEE Transactions on Evolutionary Computation.  Gary served as Vice President for Technical Activities of the IEEE Computational Intelligence Society in 2004-2005 and was the founding editor-in-chief of the IEEE Computational Intelligence Magazine, 2006-2009.  He was the President of the IEEE Computational Intelligence Society in 2010-2011 and was elected a Distinguished Lecturer for the term 2012-2014.  He received the Regents Distinguished Research Award from OSU in 2009, the 2011 Andrew P. Sage Best Transactions Paper Award from the IEEE Systems, Man and Cybernetics Society, and the 2013 Meritorious Service Award from the IEEE Computational Intelligence Society.  He is a Fellow of IEEE and IET.

 

 

A proof that the Halting Problem is undecidable

Geoffrey K. Pullum
(School of Philosophy, Psychology and Language Sciences, University of Edinburgh)

No general procedure for bug checks will do.
Now, I won’t just assert that, I’ll prove it to you.
I will prove that although you might work till you drop,
you cannot tell if computation will stop.

For imagine we have a procedure called P
that for specified input permits you to see
whether specified source code, with all of its faults,
defines a routine that eventually halts.

You feed in your program, with suitable data,
and P gets to work, and a little while later
(in finite compute time) correctly infers
whether infinite looping behavior occurs.

If there will be no looping, then P prints out ‘Good.’
That means work on this input will halt, as it should.
But if it detects an unstoppable loop,
then P reports ‘Bad!’ — which means you’re in the soup.

Well, the truth is that P cannot possibly be,
because if you wrote it and gave it to me,
I could use it to set up a logical bind
that would shatter your reason and scramble your mind.

Here’s the trick that I’ll use — and it’s simple to do.
I’ll define a procedure, which I will call Q,
that will use P’s predictions of halting success
to stir up a terrible logical mess.

For a specified program, say A, one supplies,
the first step of this program called Q I devise
is to find out from P what’s the right thing to say
of the looping behavior of A run on A.

If P’s answer is ‘Bad!’, Q will suddenly stop.
But otherwise, Q will go back to the top,
and start off again, looping endlessly back,
till the universe dies and turns frozen and black.

And this program called Q wouldn’t stay on the shelf;
I would ask it to forecast its run on itself.
When it reads its own source code, just what will it do?
What’s the looping behavior of Q run on Q?

If P warns of infinite loops, Q will quit;
yet P is supposed to speak truly of it!
And if Q’s going to quit, then P should say ‘Good.’
Which makes Q start to loop! (P denied that it would.)

No matter how P might perform, Q will scoop it:
Q uses P’s output to make P look stupid.
Whatever P says, it cannot predict Q:
P is right when it’s wrong, and is false when it’s true!

I’ve created a paradox, neat as can be —
and simply by using your putative P.
When you posited P you stepped into a snare;
Your assumption has led you right into my lair.

So where can this argument possibly go?
I don’t have to tell you; I’m sure you must know.
A reductio: There cannot possibly be
a procedure that acts like the mythical P.

You can never find general mechanical means
for predicting the acts of computing machines;
it’s something that cannot be done. So we users
must find our own bugs. Our computers are losers!
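
For readers who prefer code to couplets, here is the poem's construction transcribed as a (necessarily hypothetical) Python sketch. P cannot actually be implemented, which is exactly the point, and all names here are mine.

    def P(source: str, data: str) -> bool:
        """The putative halt-checker: True iff running `source` on `data`
        eventually halts. The poem proves no such procedure can exist."""
        raise NotImplementedError("no such procedure can exist")

    Q_SOURCE = '''
    def Q(a_source):
        if P(a_source, a_source):   # P says A run on A halts ("Good")
            while True:             # ...so Q loops forever
                pass
        # P said "Bad!" (infinite loop), so Q halts immediately
    '''

    # Feeding Q its own source forces the contradiction: if P(Q_SOURCE,
    # Q_SOURCE) is True then Q loops, and if it is False then Q halts.
    # Either way P is wrong about Q run on Q, so P cannot exist.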

On Interplanetary Internet (IPN)

During their ACM Turing Award lecture, one of the future trends presented by Cerf and Kahn was the extension of the terrestrial Internet to interplanetary communication. Here’s a brief discussion of the Interplanetary Internet.

For 40 years, NASA used point-to-point radio links to send data from the surface of Mars directly to the deep space stations located at three sites around the globe. Vint Cerf, together with some of his colleagues at the Jet Propulsion Laboratory (JPL), decided to improve communication between spacecraft by using a richer networking architecture than a point-to-point radio link. Cerf thought that since TCP/IP works well in the terrestrial Internet, they could extend the same architecture to Mars. The plan had the further advantage that the Martian surface is itself a low-delay environment.

The problems arise in the interplanetary links themselves. The first problem is that the speed of light is too slow: a round trip between Mars and Earth takes 7 to 40 minutes, and TCP/IP does not work well with such long delays. Another problem is celestial motion: objects in space are constantly moving, and the transmission of data can also be disrupted by other spacecraft. To solve these issues, they decided to create a network architecture that tolerates long delays and disruptions between communicating spacecraft, which they call the Delay/Disruption Tolerant Network (DTN). They then implemented and tested the model here on Earth.

In January 2004, rovers landed on Mars programmed to send data directly back to Earth over a point-to-point radio link, at an expected speed of only 28 kbps, which is very slow. Worse, when the radios were turned on, the rovers overheated. The team then realized that there were X-band radios both on board the rovers and on the Mars orbiters. These orbiters had previously been used to map the surface of Mars, and since that project was already finished, the orbiters and the overheated rovers were reprogrammed so that the orbiters collect the data from the rovers. Each orbiter stores the data and, when it reaches the right position, transmits the data to the deep space stations on Earth. In effect, they used the store-and-forward technique employed in terrestrial mobile networks.
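
To illustrate the store-and-forward idea, here is a toy Python sketch (my own, certainly not flight software): an orbiter keeps bundles in storage and forwards them only while a contact window with an Earth station is open.

    from collections import deque

    class Orbiter:
        """Toy DTN-style relay: bundles are held in custody until a contact
        window opens, then forwarded -- no end-to-end path is ever needed."""
        def __init__(self):
            self.stored = deque()

        def receive_from_rover(self, bundle):
            self.stored.append(bundle)             # store

        def pass_over_earth_station(self, contact_open: bool):
            sent = []
            while contact_open and self.stored:
                sent.append(self.stored.popleft()) # forward during the contact
            return sent

    relay = Orbiter()
    relay.receive_from_rover("soil-analysis-42")
    print(relay.pass_over_earth_station(contact_open=False))  # [] -- no link yet
    print(relay.pass_over_earth_station(contact_open=True))   # ['soil-analysis-42']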

When the Phoenix lander arrived on Mars, it was not programmed to send its data directly back to Earth; instead, it used the same store-and-forward technique as the previously overheated rovers. The plan, then, was to standardize the store-and-forward approach for interplanetary communication.

The resulting solution is to use the Delay/Disruption Tolerant Network with the Bundle Protocol for communicating spacecraft. The details of DTN and the Bundle Protocol are presented in [1,2,3]. Another transport protocol, named Saratoga, is proposed in [4] to run on top of the User Datagram Protocol (UDP); it is used by satellites to communicate with the ground stations on Earth.

References

[1] Vinton Cerf et al. Interplanetary Internet (IPN): Architectural Design. Jet Propulsion Laboratory / The MITRE Corporation, 2001.

[2] Kevin Fall. A delay-tolerant network architecture for challenged internets. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM ’03), ACM, New York, NY, USA, pp. 27-34, 2003. DOI: 10.1145/863955.863960.

[3] Kevin Fall, Wei Hong, Samuel Madden. Custody Transfer for Reliable Delivery in Delay Tolerant Networks. Technical report, 2003.

[4] Lloyd Wood, Wesley M. Eddy, Will Ivancic, Jim McKim, Chris Jackson. Saratoga: A Delay-Tolerant Networking convergence layer with efficient link utilization. Third International Workshop on Satellite and Space Communications (IWSSC ’07), 2007.