Blog Moved

Future posts related to technology are directly published to LinkedIn
https://www.linkedin.com/today/author/prasadchitta

Tuesday, July 29, 2014

Salt, Soldiers and Salary

What do salt, salary and soldiers have in common? All three words share the same Latin root 'sal'. A probable link is the Sanskrit word salila (सलिल - that which flows, water), a likely common root for these words.

Probably the most ancient salaried job was that of the soldier. The ancient Romans are said to have paid soldiers with bars of salt (sal dare, "to give salt", is one suggested origin of "soldier"). The salt bars could also be exchanged for other commodities, so salt worked much like money (coins) owing to its scarcity in the past.

Later came the well-established career option of the salaried employee with a monthly pay period. I took up such employment with my first pay period falling in October 1993, and a payroll has been processed for me for 250 months (July 2014 being the 250th) in one company's system or another. A majority of those monthly salaries came from TCS.

TCS: 164
Oracle: 50
ISTRAC: 27
IBM: 9
Total: 250

Ideally, savouring the TATA salt bought with a salary from TCS, one should always be worth one's salt.

More than half a million employees work in the industrial empire of the TATA group, built under the great leadership of visionaries like JRD Tata, the founder of TCS, which alone employs 3 lakh people globally.

A quote from JRD Tata - “Strive for perfection and you will reach excellence”

Remembering a great visionary on his 110th birth anniversary.

Saturday, May 31, 2014

Data and Analytics - some thoughts on consulting

On this technical blog I have not been very regular nowadays, primarily due to other writing engagements on artha SAstra and on Medium.

Yesterday I was asked to address a group of analytics enthusiasts in an interactive session arranged by NMIMS at Bangalore on the theme "Visualize to Stratagise". Over around three hours, several topics on Analytics, and specifically on Visual Analytics, were discussed.

In this post I thought of writing about two aspects of Analytics I have seen in the past few months, to give a little food for thought to those who consult on Analytics.

Let "data" speak.
A few weeks back one of my customers had a complaint about a database. The customer said they had allocated a large amount of storage to the database, and within one month all the space was consumed. As per the customer's IT department, at most 30K business transactions had been performed, by a user group of 50, on the application supported by this database. So they concluded something was wrong with the database, and it was escalated to me to look into.

I suspected some interfacing schema storing CLOB/BLOB-type data with a missing cleanup job, and asked my DBA to give me a tablespace growth trend. The growth was in the transaction schema, spread across multiple transaction tables in that schema. With this observation I ruled out an abnormal allocation in a single object.
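The growth-trend check above can be sketched in a few lines. This is a minimal illustration, assuming the DBA exports periodic tablespace snapshots; the column names and the numbers are made up for the example:

```python
import pandas as pd

# Hypothetical snapshot export: one row per schema per day,
# with the total allocated size in GB on that day.
snapshots = pd.DataFrame({
    "day":    ["2014-04-01", "2014-04-01", "2014-04-30", "2014-04-30"],
    "schema": ["TXN",        "INTERFACE",  "TXN",        "INTERFACE"],
    "gb":     [120,          15,           610,          18],
})

# Growth per schema over the period: last snapshot minus first.
growth = (snapshots.sort_values("day")
                   .groupby("schema")["gb"]
                   .agg(lambda s: s.iloc[-1] - s.iloc[0]))
print(growth.sort_values(ascending=False))
```

A trend like this immediately shows whether one interfacing schema is ballooning or, as in this case, the growth sits in the transaction schema itself.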

We thought of running simple analytics on the transaction data, looking at the creating user of those transactions, to verify whether someone had run a migration script that pulled a huge amount of data into the transaction tables, or whether there was some other human error.

To our surprise, we found 1100 active users who had created 600,000+ transactions in the database, spread across different times and mostly following a regular working-day, working-hour pattern. No nightly batch or migration user had created the data. We went ahead with detailed analytics on the data, which mapped the users across the geography of the country of operation.
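The user and working-hour check described above boils down to counting distinct creating users and bucketing transactions by hour. A small sketch, assuming the audit columns are extracted as (created_by, created_at) pairs; the rows below are invented for illustration:

```python
from collections import Counter
from datetime import datetime

# Hypothetical extract of the transaction audit columns.
rows = [
    ("alice", "2014-04-14 10:05"), ("bob",   "2014-04-14 11:30"),
    ("carol", "2014-04-15 09:45"), ("alice", "2014-04-15 14:20"),
    ("dave",  "2014-04-16 16:10"),
]

# How many distinct users actually created transactions?
users = {created_by for created_by, _ in rows}

# Bucket creation times by hour to see the working-hour pattern.
hours = Counter(datetime.strptime(ts, "%Y-%m-%d %H:%M").hour
                for _, ts in rows)

print(len(users), "distinct creating users")
print("hourly pattern:", dict(hours))
```

If the counts cluster in business hours and the user set is far larger than the expected pilot group, the data was created by real users, not by a batch job.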

We created a simple drill-down visualization of the data and submitted it to the business and IT groups at the customer, with the conclusion that the data was indeed valid, created by their users, and that there was no problem with the system.

So the data spoke for itself, and the customer's business team told the IT team that they had started using the system across the country over the last month, with all the users updating transactions on it. The IT team was not aware of this fact; it still thought the application was running in pilot mode with one location and 50 users.

Let the data speak. Let it show itself to those who need it for decision making. Democratize the data.

The second point, which came up clearly yesterday, was:

"If you torture your data long enough, it will confess anything"
Do not try to prove a known hypothesis with the help of data; that is not the purpose of analytics. With data and statistics you can possibly infer anything, and any bias towards a specific result will defeat the purpose of analytics.

So, in the process of decision making and strategising, let the data, with its modern visualization ability, be an unbiased representative that shows the recorded history of the business with all its deficiencies, all its recording errors and all its possible quality problems.

I hope I have made my two points about consulting on Analytics clear....

Friday, April 4, 2014

Data streams, lakes, oceans and “Finding Nemo” in them….

This weekend I complete 3 years of my second innings at TCS. For most of those three years I have been working with large insurance providers, trying to figure out ways to add value to their operations and strategy with technology.

The same period has been a period of re-imagination. Companies and individuals (consumers / employees) are slowly reimagining themselves in the wake of converging digital forces like cloud computing, analytics & big data, social networking and mobile computing.

Focus of my career in Information Technology has always been “Information” and not technology. I am a firm believer in “Information” led transformation rather than “technology” led transformation. The basis for information is data and the ability to process and interpret the data, making it applicable and relevant for the operational or strategic issues being addressed by the corporate business leaders.

Technologists are busy claiming that their own technology is best suited for current data processing needs. Storage vendors are finding business in providing storage in the cloud. Mobility providers are betting big on wearable devices, making computing more and more pervasive. The big industrial manufacturers are busy fusing sensors everywhere and connecting them to the internet, following the trend set by the human social networking sites. A new breed of scientists, calling themselves data scientists, are inventing algorithms to quickly derive insights from the data being collected. Each of them pushes itself to the front, taking support from the others to market itself.

Amid this rush, there is a distinctive trend in business houses: the CTO projecting technology as a business growth driver and taking a dominant role. Data flows end up plumbed across the IT landscape through various technologies, causing a lot of hurried and fast-changing plumbing issues.

In my view the data flow should be natural, just like streams of water. Information should flow naturally across the landscape, and technology should be used to make the flow gentle, avoiding floods and tsunamis. Creating data pools in cloud storage and connecting those pools into a knowledge ecosystem that grows the right insights for the business context remains the big challenge today.

The information architecture of the big data and analytics arena is like dealing with big rivers: building the right reservoirs and connecting them to get the best benefit from the landscape. And a CIO is still needed, and responsible, for this in the corporation.

If data becomes an ocean, finding insights becomes an effort like "Finding Nemo", and the overall objective may be lost. Cautiously avoiding the data ocean, let us keep (big) data in its pools and lakes as usable information while reimagining data in the current world of re-imagination. This applies to corporate business houses as well as individuals.

Hoping innovative reimagination in the digital world helps improve life in the ecosystems of the real world….

Friday, January 31, 2014

Cloud Architecture Security & Reliability

Yesterday I gave a presentation at SSIT, Tumkur on Cloud Architecture Security & Reliability to the faculty members of SSIT and SIT Tumkur.

With the advent of the Cloud Computing paradigm, at least five categories of "actors" have emerged:
  1. Cloud Consumers
  2. Cloud Providers
  3. Cloud Brokers
  4. Cloud Auditors
  5. Cloud Carriers
The NIST conceptual reference model gives a nice overview of these. ( http://www.nist.gov/itl/cloud/upload/NIST_SP-500-291_Version-2_2013_June18_FINAL.pdf )


[Figure: NIST cloud computing conceptual reference model]

Security, more specifically "Information Security", is a cross-cutting concern across all these actors. The CSA publishes its top threats regularly. The top threats for 2013 are:
  1. Data Breaches
  2. Data Loss
  3. Account Hijacking
  4. Insecure APIs
  5. Denial of Service
  6. Malicious Insiders
  7. Abuse of Cloud Services
  8. Insufficient Due Diligence
  9. Shared Technology Issues

All these threats translate to protecting four major areas of Cloud Architecture:

  1. Application Access - Authentication and Authorization
  2. Separation of Concerns - Privileged user access to sensitive data
  3. Key Management - of encryption keys
  4. Data at Rest - Secure management of copies of data
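The first of those areas, application access, comes down to authenticating a caller and then checking an explicit grant before any action on a cloud resource. A minimal sketch of such an authorization check; the roles, actions and permission map are hypothetical:

```python
# Hypothetical role-to-permission map; "manage_keys" stands in for the
# privileged key-management operations mentioned above.
ROLE_PERMISSIONS = {
    "admin":   {"read", "write", "manage_keys"},
    "analyst": {"read"},
}

def authorize(role: str, action: str) -> bool:
    """Return True only if the role explicitly grants the action
    (deny by default for unknown roles or actions)."""
    return action in ROLE_PERMISSIONS.get(role, set())

print(authorize("analyst", "read"))         # permitted
print(authorize("analyst", "manage_keys"))  # denied: separation of concerns
```

The deny-by-default shape is the point: a privileged operation like key management stays invisible to ordinary roles, which is exactly the separation-of-concerns requirement in area 2.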
Interestingly, the ENISA threat landscape also points to similar emerging threats related to Cloud Computing:

[Figure: ENISA threat landscape - emerging threats related to Cloud Computing]

Is there any shortcut to achieving security for any of the actors in the Cloud? I do not think so. The perspective presented by Booz & Co on cloud security has a nice ICT resilience life cycle, which was also discussed.

Finally, there was a good discussion on Reliability and Redundancy. The key question was how to achieve better reliability for a complex IT system consisting of multiple components across multiple layers (i.e., web, application, database): make the best use of the non-failing components to share the load, isolate the failed component by decoupling it from the cluster, and seamlessly rebalance the workload across the remaining working components.
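The isolate-and-rebalance idea above can be sketched as a health-check pass over a node pool followed by round-robin dispatch over the survivors. The node names and the health map are invented for the example:

```python
import itertools

# Hypothetical health-check results for three application-layer nodes.
health = {"app1": True, "app2": False, "app3": True}

def rebalance(nodes):
    """Drop failing nodes from the pool and return the healthy list
    plus a round-robin dispatcher over the survivors."""
    alive = [n for n in nodes if health[n]]
    return alive, itertools.cycle(alive)

alive, dispatch = rebalance(["app1", "app2", "app3"])
assigned = [next(dispatch) for _ in range(4)]  # app2 is decoupled from the cluster
print(assigned)
```

In a real cluster the health map would be fed by periodic probes and the dispatcher would sit in a load balancer, but the shape is the same: failed components are isolated, and the workload is seamlessly spread over the rest.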

Overall it was a good session to interact with academia!


Friday, January 10, 2014

Social Analytics for Online Communities

This is the first post of 2014. Happy new year to one and all...........

A recent discussion on knome (TCS' internal social platform) about managing online communities, controlling spam, and making the best of an enterprise social platform at the scale of ~200K members made me study the application of Social Analytics to achieve these objectives.

While researching on the internet, I came across this paper - http://vmwebsrv01.deri.ie/sites/default/files/publ... - titled "Scalable Social Analytics for Online Communities" by Marcel Karnstedt, Digital Enterprise Research Institute (DERI), National University of Ireland, Galway (marcel.karnstedt@deri.org).

This post is to summarize the contents of the paper and some of my thoughts around it.

The success of a social platform depends on the strength of the analytics that understand and drive the dynamics of the network built by the platform.

To achieve these goals we need a set of tools that can perform multidimensional analysis: structural, behavioural, content/semantic and cross-community analysis.

Structural Analysis: Analyse all the communities and memberships, and identify sub-communities based on strong relations between members, as well as influencers/leaders and their followers.
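In its simplest form, structural analysis treats interactions as a graph and surfaces influencers by connectivity. A toy sketch using degree counting; the "replied-to" edge list and member names are made up:

```python
from collections import Counter

# Hypothetical "replied-to" edges from one community: (from, to).
edges = [("amy", "bob"), ("carl", "bob"), ("dana", "bob"),
         ("amy", "carl"), ("bob", "amy")]

# Degree = number of interactions a member takes part in, either side.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

influencer, links = degree.most_common(1)[0]
print(influencer, "is the most connected member with", links, "links")
```

Real structural analysis would go further (community detection, weighted or directed centrality), but even raw degree already separates likely influencers from peripheral members.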

Behavioural Analysis: Analyse the interactions to identify the helpful experts (or sub-groups) who provide information and the newbies seeking information who benefit from the interactions. Both micro-level (individual) and macro-level analysis are needed.

Content / Semantic analysis: Use text mining to detect, track and quantitatively measure current interest, and shifts in interest, in topics and sentiment within the community.

Cross-community dynamics: Understand how community structures and sub-structures influence each other, to detect redundancies and complementarities, and to merge and link communities together.

All four dimensions of analysis need to be combined in a scalable, real-time model to achieve the best understanding, control and utility of socially generated data (rather, knowledge!).

New solutions for new problems! Have a nice weekend...........