Ever wondered how booking railway or flight tickets happens in a jiffy? Behind these effortless transactions are complex database systems, painstakingly designed with thousands of lines of programming code. Prof Jayant Haritsa chairs the Department of Computer Science and Automation at the Indian Institute of Science (IISc), Bangalore, and is a professor of Database Systems at the Supercomputer Education and Research Centre there. He won the Infosys Prize for Engineering and Computer Science last year for his contributions to the design and optimisation of database engines that form the core of information systems. Haritsa spoke to edex on the need for more students to take up research in the field. Excerpts from the chat.
How do database systems work?
Think of a situation where you want to travel from Bangalore to Delhi. There are numerous routes one could take for this journey, and your goal is to identify the ideal choice, taking into account factors like time, distance and fare. With a database system, likewise, users merely have to declare their final objective, and the system identifies the fastest way to achieve it. This is computationally very difficult, as thousands of different strategies have to be laboriously enumerated, and each one of them needs to be evaluated to find the best solution.
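The travel analogy can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not how any real database engine is built: the cities, journey times, fares and weights are all made up, and the function names are hypothetical. It lists every possible route, scores each one, and keeps the cheapest, which is the enumerate-and-evaluate idea described above.

```python
# Hypothetical legs: (from, to) -> (hours, fare in rupees).
# All values are invented for illustration only.
LEGS = {
    ("Bangalore", "Delhi"): (36, 900),
    ("Bangalore", "Hyderabad"): (11, 400),
    ("Hyderabad", "Delhi"): (22, 700),
    ("Bangalore", "Mumbai"): (17, 600),
    ("Mumbai", "Delhi"): (15, 650),
    ("Hyderabad", "Mumbai"): (13, 450),
}


def all_routes(src, dst, visited=()):
    """Yield every route from src to dst that never revisits a city."""
    visited = visited + (src,)
    if src == dst:
        yield visited
        return
    for (start, end) in LEGS:
        if start == src and end not in visited:
            yield from all_routes(end, dst, visited)


def route_cost(route, hour_weight=1.0, fare_weight=0.01):
    """Combine time and fare into one score (the weights are assumptions)."""
    legs = list(zip(route, route[1:]))
    hours = sum(LEGS[leg][0] for leg in legs)
    fare = sum(LEGS[leg][1] for leg in legs)
    return hour_weight * hours + fare_weight * fare


def best_route(src, dst):
    """Evaluate every candidate route and return the one with the lowest score."""
    return min(all_routes(src, dst), key=route_cost)


if __name__ == "__main__":
    for route in all_routes("Bangalore", "Delhi"):
        print(" -> ".join(route), round(route_cost(route), 2))
    print("Chosen:", " -> ".join(best_route("Bangalore", "Delhi")))
```

With only four cities there are just four routes to compare. The number of candidates grows very quickly as cities are added, which is why the exhaustive approach becomes so expensive at realistic scale.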
Are enough students taking to research in database optimisation? What is the situation at IISc?
We don’t have enough students working on the nuts and bolts of database systems. Our computer science curriculum and our software industry thrive entirely on the applications and services part of it. It is easy to take up well-known software packages like Linux or DB2 and utilise them, just as it is easy to drive a car but not to design the car’s mechanics. Researching and designing core data processing engines is time-consuming and requires a lot of effort. There is no low-hanging fruit there.
Most students take to the application side of Computer Science as there are no entry barriers. A good background in Mathematics, data structures and algorithms is enough to contribute to these areas. You will find MNCs, literally on every street, that are willing to give you a job in application development. There are very few companies, like IBM, HP and Microsoft, that take a bottom-up approach.
Barring a few, the vast multitude of engineering institutes pay only cursory attention to aspects of database engines. Research in database systems requires grounding in database systems, computer architecture and operating systems, as well as an in-depth understanding of complex historical concepts and physical implementations. Prior research literature needs to be thoroughly understood before working on these topics. Fortunately, at IISc, we have been able to build a strong group of graduate students who have made fundamental contributions to the design of database engines.
How do we draw students towards taking up research in database engines?
Creating awareness about the success stories in the field is one way. The Aadhaar scheme is the first on this list. Here, the system checks the eye scan and other biometrics and prevents duplicate records, so that citizens don’t get double benefits under the government’s beneficiary schemes. None of this is possible without core database engines at work. Real-time databases are at work when you are, for example, trying to sell or buy stocks. The price changes as the transaction happens, and hence the action is time-bound. You also have biological databases that help in accurate medical prognosis.
What are the challenges specific to the Indian database community?
Diversity of data types, ensuring data security and privacy, and non-volatile memory are some of the challenges facing the worldwide community. Currently, you have the hard disk, which is the permanent memory, and the main memory or RAM, which is temporary. The moment you switch off the computer, the data in main memory is lost. There are a few MNCs and groups trying to change this, to make memory non-volatile. In India, the lack of data compliance standards and the presence of data in different languages and formats are major challenges.
What are the opportunities available to those wanting to pursue a career in this field?
In recent years, several MNCs have started creating development groups in India to work on engine-related issues, and they are all clamouring to hire students with the requisite skill-sets. So, rather than just being another cog in the wheel, students who are willing to get their hands dirty will find themselves much in demand.
How can research in database systems offer solutions to the problem of language that comes in the way of connectivity at times?
If you keyed in, let’s say, the names of students and asked for their addresses in a particular database, you would get the results faster in English than in Kannada or Hindi. There is a bias in commercial software, which gives preferential treatment to data stored in European languages. We are trying to make it a level playing field, fair and democratic. The problem is that it is not a pressing commercial need, yet. This requires the development of novel storage architecture, the design of new query operators and a supporting algebra for efficient syntactic and semantic matching across languages.