Saturday, 27 May 2017

18 Big Data tools you need to know!!

In today’s digital transformation, big data has given organizations an edge to analyze customer behavior and hyper-personalize every interaction, which results in cross-sell, improved customer experience and, of course, more revenue.
The market for Big Data has grown steadily as more and more enterprises have implemented a data-driven strategy. While Apache Hadoop is the most well-established tool for analyzing big data, there are thousands of big data tools out there, all of them promising to save you time and money and help you uncover never-before-seen business insights.
I have selected a few to get you going...
Avro: was developed by Doug Cutting and is used for data serialization, encoding the schema of Hadoop files along with the data.

Cassandra: is a distributed, open source database designed to handle large amounts of distributed data across commodity servers while providing a highly available service. It is a NoSQL solution that was initially developed by Facebook and is used by many organizations such as Netflix, Cisco and Twitter.

Drill: An open source distributed system for performing interactive analysis on large-scale datasets. It is similar to Google’s Dremel, and is managed by Apache.

Elasticsearch: An open source search engine built on Apache Lucene. It is developed in Java and can power extremely fast searches that support your data discovery applications.

Flume: is a framework for populating Hadoop with data from web servers, application servers and mobile devices. It is the plumbing between sources and Hadoop.

HCatalog: is a centralized metadata management and sharing service for Apache Hadoop. It allows for a unified view of all data in Hadoop clusters and allows diverse tools, including Pig and Hive, to process any data elements without needing to know physically where in the cluster the data is stored.

Impala: provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase using the same metadata, SQL syntax (Hive SQL), ODBC driver and user interface (Hue Beeswax) as Apache Hive. This provides a familiar and unified platform for batch-oriented or real-time queries.

JSON: Many of today’s NoSQL databases store data in the JSON (JavaScript Object Notation) format, which has become popular with Web developers.
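To make the format concrete, here is a minimal sketch using Python's standard json module. The customer record and its field names are made up for illustration and are not taken from any particular database.

```python
import json

# Hypothetical customer record, shaped like a document a NoSQL store might hold.
record = {
    "id": 42,
    "name": "Asha",
    "tags": ["loyalty", "mobile"],
    "address": {"city": "Pune"},
}

text = json.dumps(record, sort_keys=True)  # Python dict -> JSON text
restored = json.loads(text)                # JSON text -> Python dict
```

Nested objects and lists survive the round trip, which is a large part of why document databases find the format convenient.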

Kafka: is a distributed publish-subscribe messaging system capable of handling all the data flow activity on a consumer website and processing that data. This type of data (page views, searches, and other user actions) is a key ingredient of the current social web.
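The publish-subscribe idea can be sketched in a few lines of plain Python. This is a toy, in-memory illustration and not Kafka's API: the ToyBroker class, the topic name and the consumer names are all assumptions made up for the example.

```python
from collections import defaultdict

class ToyBroker:
    """In-memory stand-in for a publish-subscribe broker (illustration only)."""

    def __init__(self):
        self._log = defaultdict(list)     # topic -> append-only list of messages
        self._offsets = defaultdict(int)  # (topic, consumer) -> next index to read

    def publish(self, topic, message):
        self._log[topic].append(message)

    def poll(self, topic, consumer):
        """Return the messages this consumer has not seen yet on the topic."""
        start = self._offsets[(topic, consumer)]
        messages = self._log[topic][start:]
        self._offsets[(topic, consumer)] = len(self._log[topic])
        return messages

broker = ToyBroker()
broker.publish("page_views", {"user": "u1", "page": "/home"})
broker.publish("page_views", {"user": "u2", "page": "/search"})

first = broker.poll("page_views", "analytics")    # both messages
second = broker.poll("page_views", "analytics")   # nothing new
other = broker.poll("page_views", "recommender")  # a second consumer reads independently
```

Each consumer keeps its own position in the log, so several downstream systems can read the same stream of user actions without interfering with one another.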

MongoDB: is a document-oriented NoSQL database, developed as open source. It comes with full index support, the flexibility to index any attribute, and the ability to scale horizontally without affecting functionality.

Neo4j: is a graph database that boasts performance improvements of up to 1000x or more compared with relational databases.

Oozie: is a workflow processing system that lets users define a series of jobs written in multiple languages, such as MapReduce, Pig and Hive, and then intelligently links them to one another. Oozie allows users to specify dependencies between jobs.
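The idea of jobs linked by dependencies can be sketched with Python's standard graphlib module (Python 3.9 or later). This is not how Oozie itself is configured; the job names below are hypothetical and only illustrate how a scheduler can derive a valid run order.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
dependencies = {
    "ingest": set(),
    "clean_pig": {"ingest"},
    "aggregate_hive": {"clean_pig"},
    "report_mapreduce": {"aggregate_hive", "clean_pig"},
}

# A job appears in the order only after everything it depends on.
order = list(TopologicalSorter(dependencies).static_order())
```

If two jobs depended on each other, TopologicalSorter would raise a CycleError, which is the kind of mistake a workflow system has to reject up front.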

Pig: is a Hadoop-based language developed by Yahoo. It is relatively easy to learn and is adept at very deep, very long data pipelines.

Storm: is a free and open source system for real-time distributed computing. Storm makes it easy to reliably process unstructured data flows in real time. Storm is fault-tolerant and works with nearly all programming languages, though typically Java is used. Originally open-sourced by Twitter, Storm is now an Apache project.

Tableau: is a data visualization tool with a primary focus on business intelligence. You can create maps, bar charts, scatter plots and more without the need for programming. Tableau recently released a web data connector that lets you connect to a database or API, giving you the ability to show live data in a visualization.

ZooKeeper: is a service that provides centralized configuration and an open source naming registry for large distributed systems.

Every day more tools are added to the big data technology stack, and it is extremely difficult to keep up with each and every one. Select a few that you can master, and keep upgrading your knowledge.

Sunday, 21 May 2017

Top 7 Virtual Reality Industry use cases

Today Digital Transformation has entered our lives, and we use it subconsciously every day, e.g. smartphones, smart cars, internet-connected devices, etc.

Virtual Reality technology has evolved dramatically in the past few years, and the cost of VR devices has gone down, so it is all set to hit mainstream markets soon. While gaming applications such as Pokemon Go (strictly speaking an augmented reality game) have attracted most of the attention, there are many other use cases that could have a much larger impact on our lives.

Google Cardboard is a super low-cost headset ($15) to which a compatible, VR enabled mobile phone is attached to deliver the VR experience.

Another commercial product is Oculus Rift gear, which has become extremely popular in gaming and business alike.

Here are some great VR use cases:

1.     VR for Tourism: Do you want to sit on your couch and climb the Eiffel Tower? Or walk on the glass horseshoe at the Grand Canyon? Wild Within is a VR app that lets you experience travel through a rainforest in Canada. Travelers around the world can experience a helicopter flight around New York City or a boat ride around the Statue of Liberty.

2.     VR for Education: Over the last decade eLearning has picked up a great deal, but it could not deliver the hands-on experience that is now possible with VR technology. Technicians can work through real-life examples and practice solving problems on the shop floor. Medical students can perform virtual surgeries, which allows them to make mistakes without any impact on actual patients.

3.     VR for Sales: Traditionally automakers have used the showroom to show cars to customers and explain their features, sometimes with a test drive. But customizing how the interior would look to suit the customer's taste was not possible; now it can be done via VR. Audi is experimenting with this in London, where customers can configure their Audi with the accessories they want and drive it virtually in real time.

4.     VR in Gaming: Who does not know the excitement Pokemon Go created, reaching 50 million users in a record 22 days? Games using AR/VR technology have changed the lives of seniors as well as teens. Game of Thrones has capitalized on VR and gone viral in various countries.

5.     VR in Designing: Product design is a tedious task, and changing products in response to competition or customization is time consuming. This is where VR helps designers. They can now create products easily, configure all the features and test them out. It is especially popular in building construction, to see how the interior will look.

6.     VR in Marketing: With Digital Marketing, ads are becoming more intrusive. The best marketing campaigns use VR so that users get completely immersed in the content, creating memorable experiences. Coca-Cola created a virtual reality sleigh ride. The New York Times releases multiple immersive documentaries in its app. Finnair is showing its Airbus A350 via VR to attract more customers.

7.     VR in Sports coaching: The potential for VR in sports is endless. You get all the benefits of real-world interaction, but in a controlled environment. Showing is so much more effective than explaining, and experiencing something first-hand is that much more powerful again. Football and cricket coaching are two examples.


Virtual reality technology holds enormous potential to change the future for a number of fields, from medicine, business, and architecture to manufacturing. We are on a roller coaster ride!!

Saturday, 13 May 2017

Internet of (Medical) things in Healthcare

Over the past few decades, we’ve gotten used to the Internet and cannot imagine our lives without it. Millennials and new-age kids don’t even know what life is like without being online.

With the disruption of Digital Transformation, the Internet of Things has added lots of opportunities for businesses and for consumers like us alike.

IoT means connecting things, extracting data, storing, processing and analyzing it in big data platforms, and making decisions based on analytics. It helps in predicting certain outcomes, thereby helping with taking preventive actions.

The popularity of wearables, such as fitness trackers, blood glucose monitors and other connected medical devices, has taken healthcare by storm. Connected devices have become a prevalent phenomenon in the consumer space and have made their way into healthcare.

Healthcare is fast adopting IoT and changing rapidly, as IoT reduces costs, boosts productivity, and improves quality. IoT can also boost patient engagement and satisfaction by allowing patients to spend more time interacting with their doctors.

There are a number of opportunities for the internet of things to make a difference in patients' lives. IoT-enabled devices capture and monitor relevant patient data and allow providers to gain insights without having to bring patients in for visits. Adding sensors to medicines or delivery mechanisms allows doctors to keep accurate track of whether patients are sticking to their treatment plan, and helps avoid patient readmissions.

Patients are using these connected medical products to capture ECG readings, record medication levels, detect falls and act as telehealth units.

Diabetes self-management includes all sorts of gadgets and devices, which control glucose levels and remind patients to take their insulin dose. The newest wearables are even capable of delivering insulin on their own, according to health condition indicators. 

Remote patient monitoring is one of the most significant cost-reduction features of IoT in healthcare. Hospitals don’t have to worry about bed availability, and doctors or nurses can keep an eye on their patients remotely. At the same time, patients usually feel more relaxed at home and recover faster.

Smart beds are a convenient solution for patients who have trouble adjusting bed positions on their own. This kind of IoT tool can sense when the patient is trying to move on their own and it reacts by correcting the bed angle or adjusting pressure to make the person more comfortable. Additionally, this frees up nurses, who don’t have to be available all the time and can dedicate extra time to other duties. Many hospitals have already introduced smart beds in their rooms.

At Boston Medical Center, IoT is part of everyday life:
·       Newborn babies are given wristbands, allowing a wireless network to locate them at any time.
·       The hospital has installed wireless sensors in refrigerators, freezers and laboratories to ensure that blood samples, medications and other materials are kept at the proper temperatures.
·       The hospital has more than 600 IoT-enabled infusion pumps. BMC staff members can now dispense and change medications automatically through the wireless network, rather than having to physically touch each pump to load it or make changes.
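The temperature monitoring mentioned in this list boils down to comparing each sensor reading with an allowed range and raising an alert when it drifts out. Here is a minimal sketch of that check; the sensor names and temperature limits are assumptions for illustration, not Boston Medical Center's actual setup.

```python
# Hypothetical allowed temperature ranges in degrees Celsius; real limits
# depend on the stored material and the hospital's own policy.
ALLOWED_RANGE_C = {
    "refrigerator": (2.0, 8.0),
    "freezer": (-30.0, -15.0),
}

def out_of_range(readings):
    """Return (sensor_id, temperature) for readings outside the allowed range."""
    alerts = []
    for sensor_id, unit_type, temp_c in readings:
        low, high = ALLOWED_RANGE_C[unit_type]
        if not low <= temp_c <= high:
            alerts.append((sensor_id, temp_c))
    return alerts

readings = [
    ("fridge-01", "refrigerator", 4.5),
    ("fridge-02", "refrigerator", 9.1),   # too warm
    ("freezer-01", "freezer", -20.0),
]
alerts = out_of_range(readings)
```

In a real deployment the readings would arrive continuously over the wireless network and the alerts would go to staff, but the decision rule itself can stay this simple.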

At Florida Hospital, when patients go in for surgery, they're tagged with real-time location system (RTLS) badges that track their progress from the pre-op room to the surgical suite to the recovery unit, so relatives can follow the patient from outside.

Philips GoSafe can be worn as a pendant, and it helps to detect falls in elderly people and raise an alert.

There are a few challenges as well in implementing IoT:
·       Data security & lack of standard security policy
·       Hospital’s internal system integration with IoT data
·       Further changes and improvements in IoT hardware

The Internet of Medical Things is a game-changer, as the future will be a connected, integrated and secure healthcare industry.

Sunday, 7 May 2017

Terminator or Iron Man – What will AI bring in future?

In the age of Digital Transformation, Artificial Intelligence has come a long way from Siri to driverless cars.

If you have used Google Maps to navigate in your car, purchased a book recommended to you by Amazon or watched a movie suggested to you by Netflix, then you have interacted with artificial intelligence.

Artificial Intelligence is the capability of a machine to imitate intelligent human behavior. It relies on processing and comparing, with the help of big data analytics, volumes of data so vast that no human being could ever absorb them.

Stephen Hawking, Elon Musk and Bill Gates have recently expressed concern in the media about the risks posed by AI.

According to them, AI will soon replace all kinds of manual tasks and make humans redundant. This could be true in some sense, but it is a far cry from the current maturity level of AI, which is still at the stage of figuring out real-world use cases.

Today machines can carry out complex actions but without a mind or thinking for themselves. Smartphones are smart because they are responding to your specific inputs.

The world’s top tech companies are in a race to build the best AI and capture that massive market, which means the technology will get better fast and come at us at ever greater speed. IBM is investing billions in Watson, Apple is improving Siri, Amazon is banking on Alexa; Google, Facebook and Microsoft are devoting their research labs to AI and robotics.

Together, they will swirl into that roaring twister, blowing down the industries and businesses in its path.

Within maybe a few years, AI will be better than humans at diagnosing medical images and converting speech to emotions. But it could also be used to steal millions of records from a government agency to identify targets vulnerable to extortion.

Soon you’ll be able to contact an AI doctor on your smartphone, talk to it about your symptoms, use your camera to show it anything it wants to see and get a diagnosis that tells you to either take a couple of Tylenols or see a specialist.

In all the fairy tales we have seen so far, good almost always wins over evil.
This is what we have seen in movies like I, Robot or Avengers: Age of Ultron. But Will Smith or the team of Avengers does not know that until the end of the story. That’s where we are now: face to face with the demon for the first time, doing everything we can to get through the scary plot alive.

Today many companies are using AI for improving their business:
·         Geico is using Watson-based cognitive computing to learn the underwriting guidelines, read the risk submissions, and effectively help underwrite.
·         Google Translate applies AI not only to translate words, but to understand the meaning of sentences and provide a true translation.
·         IBM Watson is the most prominent example of AI-based question answering over petabytes of data, which helps in various areas like finance, healthcare and insurance.

As humans we are programmed from childhood, either by nurture or nature, to do things the way we do. All the nine emotions we have learned since then are an inseparable part of our lives.

Currently we are in control of the planet because we are the smartest species compared to all the animals.

But if and when machines learn to love or hate, work in peace or retaliate in anger, then the day is not far off when, with their ability to consume and digest vast amounts of data, they become smarter and start taking control of the planet.

Only then will we know whether AI is helping us like Iron Man's Jarvis or planning to eradicate us like the Terminator!!

Saturday, 29 April 2017

5 ways to improve the model accuracy of Machine Learning!!

Today we are in the digital age, and every business is using big data and machine learning to effectively target users with messaging in a language they really understand and to push offers, deals and ads that appeal to them across a range of channels.

With exponential growth in data from people and the internet of things, a key to survival is to use machine learning to make that data more meaningful and more relevant, to enrich customer experience.

Machine Learning can also wreak havoc on a business if improperly implemented. Before embracing this technology, enterprises should be aware of the ways machine learning can fall flat. Data scientists have to take extreme care while developing these machine learning models so that they generate the right insights to be consumed by the business.

Here are 5 ways to improve the accuracy and predictive ability of a machine learning model and ensure it produces better results.

·       Ensure that you have a variety of data that covers almost all scenarios and is not biased toward any situation. There was news in the early Pokemon Go days that its locations were concentrated in white neighborhoods. It’s because the creators of the algorithms failed to provide a diverse training set, and didn't spend time in other neighborhoods. Instead of working with limited data, ask for more data. That will improve the accuracy of the model.

·       Often the data received has missing values. Data scientists have to treat outliers and missing values properly to increase the accuracy. There are multiple methods to do that: impute mean or median values for continuous variables, and for categorical variables impute the mode or use a separate class. For outliers, either delete them or perform some transformation.

·       Finding the right variables or features, the ones which will have maximum impact on the outcome, is one of the key aspects. This comes from better domain knowledge and visualizations. It’s imperative to consider as many relevant variables and potential outcomes as possible prior to deploying a machine learning algorithm.

·       Ensembling means combining multiple models to improve accuracy, using techniques such as bagging and boosting. An ensemble can deliver better predictive performance than any single model. Random forests are a commonly used ensemble method.

·       Re-validate the model at a proper frequency. It is necessary to score the model with new data every day, every week or every month, based on changes in the data. If required, rebuild the models periodically with different techniques to challenge the model in production.
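The missing-value and outlier treatment from the list above can be sketched with nothing but Python's standard statistics module. The sample values are made up, and capping at 1.5 times the interquartile range is just one common convention for the "transformation" option, chosen here as an assumption.

```python
from statistics import median, mode, quantiles

def impute(values, strategy="median"):
    """Replace None with the median (numeric) or the mode (categorical)."""
    present = [v for v in values if v is not None]
    fill = median(present) if strategy == "median" else mode(present)
    return [fill if v is None else v for v in values]

def cap_outliers(values, k=1.5):
    """Pull values lying more than k * IQR beyond the quartiles back to that limit."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]

ages = impute([25, 31, None, 40, 29])                              # numeric column
tiers = impute(["gold", "silver", None, "gold"], strategy="mode")  # categorical column
incomes = cap_outliers([30, 32, 35, 31, 33, 400])                  # 400 is the outlier
```

Whether to cap, transform or delete an outlier depends on whether it is a data error or a genuine extreme case, so treat the cut-off as something to review with domain experts.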

There are some more ways but the ones mentioned above are foundational steps to ensure model accuracy.

Machine learning puts a superpower in the hands of the organization, but as the Spider-Man movie says, “With great power comes great responsibility”, so use it properly.


Saturday, 22 April 2017

Beyond SMAC – Digital twister of disruption!!

Have you seen the 1996 movie Twister, about tornadoes disrupting neighborhoods? A group of people is shown trying to perfect a device called Dorothy, which has hundreds of sensors to be released into the center of a twister so that proper data can be collected to create a more advanced warning system and save people.

Today we can apply the same analogy: digital is disrupting every business, and if you stand still and don’t adapt you will become a digital dinosaur. Everyone wants that advance warning of what is coming.

Even if your business is going strong right now, you never know who will disrupt you tomorrow.

We have seen these waves of disruption and innovation in technology: the mainframe era, the minicomputer era, the personal computer and client-server era, and the internet era. Then came the 5th wave, the SMAC era, comprising Social, Mobile, Analytics and Cloud technologies.

Gone are the days when we used to wait for vacations to meet our families and friends by travelling to our native place or abroad. Today all of us interact with each other on social media such as Facebook, WhatsApp, Instagram, Snapchat and so on, rather than in person.

Mobile enablement has given us anytime, anywhere, any-device interaction with consumers. We stare at a smartphone screen more than 200 times a day.

Analytics came in to power hyper-personalization in each interaction and send relevant offers and communications to customers. Descriptive analytics gave the power to know what is happening to the business right now, while predictive analytics gave insight into what may happen. Going further, prescriptive analytics gave the foresight of what actions to take to make things happen.

Cloud gave organizations the ability to scale up quickly at lower cost as computing requirements grow, with secure private clouds.

Today we are in the 6th wave of disruption beyond the SMAC era, into Digital Transformation, bringing Big Data, Internet of Things, APIs, Microservices, Robotics, 3D printing, augmented reality/virtual reality, wearables, drones, beacons and blockchain.

Big Data allows all the tons of data generated in the universe to be stored and used for competitive edge.

Internet of Things allows machines, computers and smart devices to communicate with each other and help us carry out various tasks remotely.

APIs are getting a lot of attention as they are easy, lightweight, highly customizable and can be plugged into virtually any system to ensure data flows between disparate systems.

Microservices are independently developed & deployable, small, modular services. 

Robotics is bringing a wave of intelligent automation with the help of cognitive computing.

3D printing, or additive manufacturing, is taking several industries such as medical, military, engineering and manufacturing by storm.

Augmented reality / virtual reality is changing travel, real estate and education.

Wearables such as smart watches, health trackers and Google Glass can provide real-time updates, ensure better health and enable hands-free process optimization in areas like item picking in a warehouse.

Drones have come out of the military zone and are available for common use. Amazon and Domino's are using them for delivery, while insurance and agriculture are using them for aerial surveys.

Beacons are revolutionizing the customer experience with in-store analytics, proximity marketing, indoor navigation and contactless payments.

The new kid on the block is blockchain, where the finance industry is all set to take advantage of this technology.

As products and services are getting more digitized, traditional business processes, business models and even businesses are getting disrupted.

The only way to survive this twister is to get closer to your customers by offering a radically different way of doing business that’s faster, simpler and cheaper.

Saturday, 15 April 2017

A to Z of Analytics

Analytics has taken the world by storm, and it is the powerhouse for all the digital transformation happening in every industry.

Today everybody is generating tons of data: we as consumers leave digital footprints on social media, IoT generates millions of records from sensors, and mobile phones are used from morning till we sleep. All this variety of data formats is stored in Big Data platforms. But just storing this data will not take us anywhere unless analytics is applied to it. Hence it is extremely important to close the loop with analytics insights.

Here is my version of A to Z for Analytics:

Artificial Intelligence: AI is the capability of a machine to imitate intelligent human behavior. BMW, Tesla and Google are using AI for self-driving cars. AI should be used to solve tough real-world problems, from climate modeling to disease analysis, for the betterment of humanity.

Boosting and Bagging: These are techniques used to generate more accurate models by ensembling multiple models together.

CRISP-DM: is the Cross-Industry Standard Process for Data Mining. It was developed by a consortium of companies like SPSS, Teradata, Daimler and NCR Corporation in 1997 to bring order to developing analytics models. The 6 major steps involved are business understanding, data understanding, data preparation, modeling, evaluation and deployment.

Data preparation: In analytics deployments more than 60% of the time is spent on data preparation. The general rule is garbage in, garbage out. Hence it is important to cleanse and normalize the data and make it available for consumption by the model.

Ensembling: is the technique of combining two or more algorithms to get more robust predictions. It is like combining all the marks we obtain in exams to arrive at a final overall score. Random Forest is one such example, combining multiple decision trees.
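The simplest way to combine models is a majority vote, which is the combining step that bagging uses for classification. The sketch below is only an illustration: the three "models" are hand-written rules standing in for trained models, and the customer fields and thresholds are made up.

```python
from collections import Counter

# Three hypothetical, deliberately simple "models" that each predict churn (True/False).
def by_tenure(customer):
    return customer["months_active"] < 6

def by_usage(customer):
    return customer["logins_last_month"] < 3

def by_tickets(customer):
    return customer["support_tickets"] > 4

def majority_vote(models, customer):
    """Return the prediction that most of the models agree on."""
    votes = [model(customer) for model in models]
    return Counter(votes).most_common(1)[0][0]

models = [by_tenure, by_usage, by_tickets]
at_risk = {"months_active": 3, "logins_last_month": 10, "support_tickets": 6}
loyal = {"months_active": 24, "logins_last_month": 20, "support_tickets": 0}

at_risk_prediction = majority_vote(models, at_risk)  # two of three rules say churn
loyal_prediction = majority_vote(models, loyal)      # no rule says churn
```

A random forest follows the same pattern, except that each voter is a decision tree trained on a different random sample of the data.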

Feature selection: Simply put, this means selecting only those features or variables from the data which really make sense and removing irrelevant variables. This improves model accuracy.

Gini Coefficient: It is used to measure the predictive power of a model, typically in credit scoring tools, to find out who will repay and who will default on a loan.
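In credit scoring the Gini coefficient is commonly computed from the area under the ROC curve as Gini = 2 * AUC - 1. Here is a minimal sketch under that definition; the labels (1 = defaulted) and the predicted default scores are made-up sample data.

```python
def auc(labels, scores):
    """Share of (defaulter, non-defaulter) pairs where the defaulter got the higher score."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count half
    return wins / (len(pos) * len(neg))

def gini(labels, scores):
    return 2 * auc(labels, scores) - 1

labels = [1, 1, 0, 1, 0, 0]               # 1 = customer defaulted
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]   # model's predicted default risk
model_gini = gini(labels, scores)         # 7/9, roughly 0.78
```

A Gini of 1 means the model ranks every defaulter above every good customer, while 0 means it does no better than random ordering.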

Histogram: This is a graphical representation of the distribution of a set of numeric data, usually a vertical bar graph, used in exploratory analytics and the data preparation step.

Independent Variable: is the variable that is changed or controlled in a scientific experiment to test its effect on the dependent variable, like the effect of increasing the price on sales.

Jubatus: This is an online machine learning library covering Classification, Regression, Recommendation (Nearest Neighbor Search), Graph Mining, Anomaly Detection and Clustering.

KNN: The K-nearest-neighbor algorithm in machine learning is used for classification problems, based on distance or similarity between data points.
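A bare-bones version of the idea fits in a few lines of standard Python (math.dist needs Python 3.8 or later). The two-dimensional points and the "low"/"high" labels are made-up sample data.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Classify query by majority label among its k nearest training points."""
    neighbours = sorted(train, key=lambda row: dist(row[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# (features, label) pairs; the features could be, say, scaled income and spend.
train = [
    ((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.9, 1.1), "low"),
    ((5.0, 5.0), "high"), ((5.2, 4.8), "high"), ((4.9, 5.1), "high"),
]

near_low = knn_predict(train, (1.1, 1.0))
near_high = knn_predict(train, (5.0, 4.9))
```

Because everything depends on distance, features on very different scales should be normalized first, otherwise the largest-valued feature dominates the vote.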

Lift Chart: These are widely used in campaign targeting problems to determine which deciles of customers to target for a specific campaign. They also tell you how much response you can expect from the new target base.
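The numbers behind a lift chart can be computed by ranking customers by model score, cutting the ranking into deciles, and dividing each decile's response rate by the overall response rate. The sketch below assumes the number of customers divides evenly into the bins, and the scores and responses are made-up sample data.

```python
def decile_lift(scores, responses, bins=10):
    """Lift per bin: response rate in the bin divided by the overall response rate."""
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    overall_rate = sum(responses) / len(responses)
    size = len(ranked) // bins  # assumes len(scores) is a multiple of bins
    lifts = []
    for i in range(bins):
        chunk = ranked[i * size:(i + 1) * size]
        rate = sum(response for _, response in chunk) / len(chunk)
        lifts.append(rate / overall_rate)
    return lifts

# 20 hypothetical customers, scored from 1.00 down to 0.05; 1 = responded to the campaign.
scores = [i / 20 for i in range(20, 0, -1)]
responses = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
lifts = decile_lift(scores, responses)
```

Here the top decile responds at more than three times the average rate, so a campaign with a limited budget would start with those customers.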

Model: There are more than 50 modeling techniques, such as regression, decision trees, SVM, GLM and neural networks, available in technology platforms like SAS Enterprise Miner, IBM SPSS or R. They are broadly categorized as supervised and unsupervised methods, covering classification, clustering and association rules.

Neural Networks: These are typically organized in layers made up of nodes and mimic the way the brain learns. Today Deep Learning is an emerging field based on deep neural networks.
 
Optimization: It is the use of simulation techniques to identify scenarios which will produce the best results within available constraints, e.g. sale price optimization, or identifying the optimal inventory for maximum fulfillment while avoiding stock-outs.

PMML: This stands for Predictive Model Markup Language, an XML-based file format developed by the Data Mining Group to transfer models between various technology platforms.

Quartile: It is dividing the sorted output of a model into 4 groups for further action.

R: Today many universities and even corporates are using R for statistical model building. It is freely available, and there are licensed versions like Microsoft R. More than 7000 packages are now at the disposal of data scientists.

Sentiment Analytics: is the process of determining whether information or a service provided by a business leads to positive, negative or neutral human feelings or opinions. Consumer product companies are measuring sentiment 24/7 and adjusting their marketing strategies.

Text Analytics: It is used to discover and extract meaningful patterns and relationships from text collected from social media sites such as Facebook, Twitter and LinkedIn, as well as blogs and call center scripts.

Unsupervised Learning: These are algorithms that are given only input data and are expected to find patterns in it. Clustering and association algorithms like k-means and Apriori are the best examples.

Visualization: It is the method of enhanced exploratory data analysis and of showing the output of modeling results with highly interactive statistical graphics. Any model output has to be presented to senior management in the most compelling way. Tableau, QlikView and Spotfire are leading visualization tools.

What-If analysis: It is a method to simulate business scenario questions, like: if we increased our marketing budget by 20%, what would be the impact on sales? Monte Carlo simulation is a very popular technique for this.

What do you think should come for X, Y, Z?
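The marketing budget question can be turned into a tiny Monte Carlo simulation. Every number in the sketch is an assumption made up for illustration: base sales of 1000, an average return of 3 units of sales per unit of budget, and a normal distribution for the uncertainty around that return.

```python
import random

def expected_sales(marketing_budget, trials=10_000, seed=7):
    """Average simulated sales for a given marketing budget."""
    rng = random.Random(seed)  # fixed seed so both scenarios see the same random draws
    base_sales = 1000.0        # assumed sales with no marketing at all
    total = 0.0
    for _ in range(trials):
        # Assumed: each unit of budget brings in about 3 units of sales, give or take.
        return_per_unit = rng.gauss(3.0, 0.5)
        total += base_sales + marketing_budget * return_per_unit
    return total / trials

baseline = expected_sales(100.0)
scenario = expected_sales(120.0)  # what if the budget goes up by 20%?
uplift = scenario - baseline
```

The value of the simulation is less in the single average than in being able to change the assumptions, rerun it, and see how sensitive the answer is.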

Saturday, 8 April 2017

From Bullock Cart to Hyperloop – Digital Transformation of Travel

Remember when you were a teenager and wanted to go on vacation with your parents, and you were asked to go to the travel agent and get all the printed brochures of exotic locations?

Then came the dot-com wave, and online booking sites like Expedia, Travelocity and MakeMyTrip grew so much that they took travel agencies out of the equation.

We used to send holiday postcards to our friends and families back home; these have gone out of fashion due to social media postings on Facebook and Instagram.

Lonely Planet used to be the traveler’s bible, but now we go to tons of websites like TripAdvisor and Priceline, which provide us with advice and reviews on hotels, tours and restaurants.

Now I am able to book my flight online, have my boarding pass on my phone, check in with machines, go through automated clearance gates and even validate my boarding pass to board the plane.

The travel industry, like many others, is being disrupted by great ideas powered by digital technology and innovation.

Some of the digital innovations the travel industry has adopted so far:
·     Online booking sites like Expedia, Travelocity, MakeMyTrip, Trivago
·     Mobile optimization with Wi-Fi enablement
·     Targeting and hyper-personalization with Big Data Analytics
·     Digital discounts on travel by Kayak, TripAdvisor
·     Smartphones for researching vacations, deals and feedback
·     Wearables like Disney band for payments, room keys
·     Bluetooth beacons to guide travelers in the vicinity at airports
·     Virtual reality – see the places without even getting out of home

All such digital footprints of customers are collected and then analyzed by big data analytics to hyper-personalize the experience.

With extensively networked digital properties and deep hooks into customer data collected via travel booking sites and social media channels, travel companies are delivering customized dream vacations according to the likes and preferences of today’s travelers.

Today’s trend is towards spending money on memories & experiences instead of material possessions.

Accordingly, travel companies are investing in their digital storefronts and omni-channels to keep today’s hyper-connected travelers snapping, sharing, researching and reviewing on the fly – leaving immense data footprints for marketers to leverage.

Bluesmart is a high-quality carry-on suitcase that you can control from your phone. From the app you can lock and unlock it, weigh it, track its location, be notified if you are leaving it behind and find out more about your travel habits.

Thomas Cook have introduced virtual reality experiences across select stores.

Digital disrupters like Airbnb have already put tremendous pressure on hotels.

Starwood Hotels have launched “Let’s chat”, enabling guests to communicate with its front desk associates via WhatsApp, BlackBerry Messenger or iPhone before or during their stay.

The world has gone from the Bullock Cart to the Hyperloop today. The future will belong to those using data-based intelligence to offer better experiences, encourage exotic, longer and more frequent stays, and build long-term loyalty.
