Blog Moved

Future posts related to technology are directly published to LinkedIn
https://www.linkedin.com/today/author/prasadchitta

Friday, August 30, 2013

F1 Database from Google: A scalable distributed SQL database

 

This world is a sphere; we keep going round and round. After the great hype around highly distributed NoSQL databases, Google has now presented a paper at the 39th VLDB conference on how they implemented a highly scalable SQL database to support their "AdWords" business.


The news item: http://www.theregister.co.uk/2013/08/30/google_f1_deepdive/
and the paper: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/41344.pdf


The key changes I liked:
1. Hierarchically clustered physical schema model: I have always thought a hierarchical model is better suited to real-life applications than a pure relational model, and this implementation proves it.

2. Protocol Buffers: columns can hold structured types, which saves a lot of ORM-style conversion when moving data from storage into memory and vice versa.
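A minimal sketch of the two ideas together, assuming a sorted key-value store underneath (this is illustrative only, not F1's actual API; all names here are hypothetical): child rows are keyed under their parent's key so the whole hierarchy is physically clustered, and a column holds a structured value rather than flattened ORM fields.

```python
# Sketch: hierarchical clustering as key ordering, assuming a sorted
# key-value store. Child rows sort directly under their parent, so reading
# a customer and all its campaigns is one contiguous range scan, not a join.

store = {}  # stands in for a sorted, distributed key-value store

def put_customer(cust_id, info):
    # "info" is a structured value (protocol-buffer style) in a single column
    store[(cust_id,)] = {"info": info}

def put_campaign(cust_id, camp_id, budget):
    # child key is prefixed by the parent key -> physically clustered together
    store[(cust_id, camp_id)] = {"budget": budget}

def read_subtree(cust_id):
    # range scan over all keys sharing the customer prefix
    return {k: v for k, v in sorted(store.items()) if k[0] == cust_id}

put_customer(1, {"name": "Acme", "region": "EU"})
put_campaign(1, 10, budget=500)
put_campaign(1, 11, budget=750)
put_customer(2, {"name": "Zen", "region": "US"})

subtree = read_subtree(1)  # the whole customer-1 hierarchy in one scan
```

The point of the prefix keys is that the common join in an ad-serving workload (customer with all its campaigns) becomes a sequential read on one server.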


A quote from the paper's conclusion:
In recent years, conventional wisdom in the engineering community has been that if you need a highly scalable, high-throughput data store, the only viable option is to use a NoSQL key/value store, and to work around the lack of ACID transactional guarantees and the lack of conveniences like secondary indexes, SQL, and so on. When we sought a replacement for Google's MySQL data store for the Ad-Words product, that option was simply not feasible: the complexity of dealing with a non-ACID data store in every part of our business logic would be too great, and there was simply no way our business could function without SQL queries.
So, ACID is needed and SQL is essential for running a business! Have a nice weekend!!

Friday, August 23, 2013

Anticipatory Computing and Functional Programming – some rambling…

 

After an early-morning discussion on Anticipatory Computing on Knome, TCS's enterprise social network, I thought of writing this blog post linking the "functional orientation" of complex systems with consciousness.

In the computing world, it is a widely accepted fact that data can exist without any prescribed process associated with it. Once data is stored on a medium (generally called memory), it can be fed into any abstract process to derive conclusions. (This trend is generally called big-data analytics, leading to predictive and prescriptive analytics.)

But,

If I suggest that a function can exist without any prescribed data, with multiple possible outcomes, it is not so easily accepted. The only thing people can imagine here is a completely chaotic random number generator. A completely data-independent, pure function that returns a function based on its own "anticipation" is what we call consciousness.

This is one of my areas of interest in computability and information theory. A complex system's behavior is not driven entirely by the data presented to it. Trying to model a complex system purely from the past data it has emitted will not work; one must also consider the system's anticipatory bias while modeling.

Functional programming comes a step closer to this paradigm: it tries to define functions without intermediate state-preserving variables. In mathematical terms, a function maps elements of its domain to its range. Abstracting this into an anticipation model, we get consciousness (or free will) as a function with three possible return functions.

1. Will do
2. Will NOT do
3. Will do differently
(I have derived this from an ancient Sanskrit statement regarding free will: kartum, akartum, anyathA vA kartum saktaH, that is, able to do, to not do, or to do otherwise.)

The third option above (it is beyond binary, 0 or 1) leads to recursion: the function re-evaluates the alternatives, and at (t+Δt) the system again has all three options. When the anticipatory system responds, "data" starts emitting from it. The environment in which this micro anticipatory system operates is itself a macro anticipatory system.
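The three-option model above can be sketched as a pure function that, given only its own "anticipation", returns one of three outcomes; the third option recurses, so at the next time step all three options are open again. This is purely illustrative, and every name in it is hypothetical.

```python
# Sketch: free will as a pure function of its own anticipation.
# No external data is consumed; "anticipation" maps a time step to a choice.

def will_do(action):
    return f"do:{action}"

def will_not_do(action):
    return f"refrain:{action}"

def choose(anticipation, action, t=0, t_max=3):
    option = anticipation(t)
    if option == "do":
        return will_do(action)
    if option == "not":
        return will_not_do(action)
    # "do differently": recurse -- at t + dt all three options reopen
    if t >= t_max:
        return will_do(f"alternative-{action}")
    return choose(anticipation, f"alternative-{action}", t + 1, t_max)

# Example anticipation: consider alternatives twice, then act
result = choose(lambda t: "differently" if t < 2 else "do", "respond")
```

Note that only when `choose` finally resolves does any "data" (the returned string) emerge from the system, which is the point of the paragraph above.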

The ongoing hype around big data is about establishing patterns in the data emitted by various micro-systems and thereby deriving the function of the macro-freewill. It is easier for a micro-freewill to dynamically model this function itself; that ability, called "intuition", is beyond the limits of computability.

Enough of techno-philosophical rambling for this Friday! Have a nice weekend.

Thursday, August 8, 2013

Science, Research, Consulting and Philosophy

It was this day 25 years ago (08-08-1988) that I joined my Bachelor of Science course in Computer Science. My aim at that time was to become a scientist. As the years passed, I completed my Masters and joined the Indian Space Research Organisation.

Due to various reasons, I could neither register for a PhD nor continue my research career. Instead, I took up software consulting by joining TCS, the largest software services company in India. That took me through various business domains, starting with banking and moving into utilities (gas transportation), retail, financial services and insurance. Working in roles such as developer, tester, modeller, designer, architect, pre-sales solution support and offshore delivery manager gave me experience worth a PhD.

Later came a period of working with Oracle in the core Server Technologies division, where we worked closely with select elite customers of the Enterprise Manager product who were monitoring and managing large data centers.

A later period turned out to be about philosophy: the philosophy of data, information and knowledge, trying to optimize end-to-end information flows with the right strategies for the information life cycle. Efficient data capture from individual transactions; supporting operational requirements at the needed latency; making data available in the right format for humans and other computing systems; and transforming and moving it around efficiently to support much-needed long-term strategic decisions.

Most of my career to date has moved through the highs and lows of information technology hype cycles, peaks, waves and magic quadrants.

Links to blog posts made around 8 August over the years:

Last year: http://technofunctionalconsulting.blogspot.in/2012/08/multi-tenancy-and-resource-management.html

Before: http://technofunctionalconsulting.blogspot.in/2011/08/web-age-of-www.html

http://technofunctionalconsulting.blogspot.in/2010/08/8035-days-or-22-years.html

http://technofunctionalconsulting.blogspot.in/2009/08/another-year.html

http://technofunctionalconsulting.blogspot.in/2008/08/quick-recap-of-20-years-8888-till.html

Friday, August 2, 2013

Crisscrossing thoughts around #Cloud and #BigData

While "Big Data Analytics" runs on cloud-based infrastructure with thousands of (virtual) servers, cloud infrastructure management has itself become a big data problem!

Given that all key availability and performance metrics need to be collected and processed regularly, both to keep the cloud infrastructure running within the agreed performance service levels and to identify trends in demand for cloud services, there is an absolute need for predictive analytics on the collected metrics data.

As data centers gradually turn into private clouds with heavy virtualization, it becomes increasingly important to manage the underlying grid of resources efficiently by allocating the best possible resources to high-priority jobs. An integrated infrastructure monitoring and analytics framework, running on the grid itself and dynamically optimizing resource allocation to fit workload characteristics, could make the data center more efficient and green.
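As one hedged sketch of the allocation idea (not how any particular scheduler works; names and numbers are made up): rank jobs by priority and place each on the node with the most headroom that still fits its declared demand. Real schedulers weigh many more signals, including the predicted trends discussed above.

```python
# Sketch: greedy allocation of resources to high-priority jobs, assuming
# each node reports free capacity and each job declares its demand.

def allocate(jobs, nodes):
    """jobs: list of (name, priority, demand); nodes: name -> free capacity.
    Returns a job -> node placement map, highest priority placed first."""
    placements = {}
    free = dict(nodes)
    for name, priority, demand in sorted(jobs, key=lambda j: -j[1]):
        # candidate nodes that can still fit this job
        candidates = [n for n, cap in free.items() if cap >= demand]
        if not candidates:
            continue  # job waits; a real system would queue or preempt
        best = max(candidates, key=lambda n: free[n])  # most headroom
        free[best] -= demand
        placements[name] = best
    return placements

nodes = {"n1": 8, "n2": 16}
jobs = [("batch", 1, 10), ("web", 3, 6), ("etl", 2, 6)]
placed = allocate(jobs, nodes)
```

The greedy pass guarantees only that higher-priority jobs are placed first; feeding anticipated demand (from the metrics analytics) into the `demand` figures is where the predictive part would come in.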

Taking the same approach to business services across organizational boundaries, there could be an automated marketplace where public cloud providers trade their available computing resources and consumers "buy" the computing they need, getting their processing executed by combining multiple providers' resources into an extended hybrid cloud in a highly dynamic configuration.

Data and processing would have to be encapsulated as micro- or nano-scale objects, taking computing out of the current storage-processor architecture into a more connected, neuron-like architecture with billions of nodes: a really BIG bigdata.

OR

If all the computing needed on this tiny globe could be unified into a single harmonic process, the amount of data that needs moving would be minimal and a "single cloud" would serve the purpose.

Conclusion: cloud management using big data and big data running on cloud infrastructure complement each other to improve the future of computing!

Question: if I have $1 today, where should I invest it for a better future? In big data? Or in a cloud startup??

Have a fabulous Friday!