Senior Data Scientist, SmartCat
The role of business analytics in data-driven companies is pretty clear: businesses use analytics to drive their decision-making and measure progress. However, there are many business analytics tools, and with most of them it is not clear how to adjust dashboards to complex, specific use cases, get tailored insights, track custom metrics or build predictive models on top of collected data. In this post we want to share how this kind of flexibility becomes possible once you reclaim your data and apply open-source, custom business analytics solutions.
When running a website, web store, marketplace or a social network you want to be able to learn from your customers’ clicks, searches and conversions, as well as to analyze and visualize different aspects of your business, including the effectiveness of email campaigns, customer support, sales teams, etc. Insights derived from such analysis and visualizations can be extremely valuable and help you grow and succeed.
To achieve this goal, companies invest in collecting and storing their business data, typically in different places. But even once all that data warehousing is in place, making sense of the collected data can be extremely challenging and at times scary. The most interesting questions you want to answer might require stream processing from multiple sources, or custom metrics and dashboards, which is difficult or even impossible to achieve with the SaaS solutions that companies usually start with. When you realize you need answers to questions like these, it is time to consider custom business analytics.
Imagine being able to pull all that data into one place, visualize it, build and share custom interactive dashboards, slice and dice visualizations by various parameters (e.g. days, countries, products) or even build predictive models on top of it all. It is possible, but one requirement needs to be met first: you'll need to take control of your own data.
Storing data is just the first step towards improving your business in a data-driven manner. The main advantage of collecting your own data, as opposed to having it sit somewhere in the cloud behind a SaaS solution or abstracted away from you by services such as Google Analytics, is that you can use it to build recommender systems, churn prediction models or similar predictive models, while keeping the flexibility to alter your dashboards and metrics.
There are multiple ways to store data from various sources to support custom business analytics and predictive modelling, with a data lake being one of the most beneficial approaches. It could even be argued that data lakes are an advanced step in the evolution of data processing.
To unlock all the value hidden in your data, you’ll need tools. The first one that comes to mind, and the one central to custom business analytics, is Apache Superset, an open-source business intelligence tool.
Apache Superset supports 30 types of visualizations that are interactive and can be filtered on many fields of different types (e.g. iOS or Android users, age groups, gender, country, etc.). They require no coding: anyone can create new visualizations without knowing SQL, with full functionality accessible via buttons and other controls. Superset is also great for collaborating on data exploration because it is easy to share charts or entire dashboards, edit them and send them back.
To support Superset and custom analytics you’ll need a whole pipeline to pull data from many different sources. In the data lake schema above you can see that another Apache project, Apache Kafka, can serve as the ingestion layer because it has a variety of connectors. Data can be cached in different databases depending on the use case and the choice of processing tool. The processing layer performs transformations, and finally the data is stored in one of the databases compatible with Superset. Whether it’s the natively supported Druid, relational databases accessible through SQLAlchemy such as PostgreSQL, MySQL or SQLite, or distributed and scalable solutions like SparkSQL/Hive or ClickHouse, Superset makes visualization on these databases easy.
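To make the processing and storage steps a little more concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a production pipeline: the event fields (`ts`, `country`, `event`), the `daily_events` table and the function names are all hypothetical, the hard-coded list stands in for whatever an ingestion layer such as Kafka would deliver, and an in-memory SQLite database stands in for the serving store.

```python
import sqlite3
from collections import Counter

# Hypothetical raw events; in a real pipeline these would come from
# the ingestion layer rather than a hard-coded list.
RAW_EVENTS = [
    {"ts": "2024-01-01T09:15:00", "country": "RS", "event": "click"},
    {"ts": "2024-01-01T09:20:00", "country": "RS", "event": "click"},
    {"ts": "2024-01-01T11:02:00", "country": "DE", "event": "conversion"},
    {"ts": "2024-01-02T08:45:00", "country": "RS", "event": "conversion"},
    {"ts": "2024-01-02T14:30:00", "country": "DE", "event": "conversion"},
]


def aggregate_daily(events):
    """Processing step: count events per (day, country, event type)."""
    counts = Counter()
    for e in events:
        day = e["ts"][:10]  # keep only the YYYY-MM-DD part of the timestamp
        counts[(day, e["country"], e["event"])] += 1
    return [(d, c, ev, n) for (d, c, ev), n in sorted(counts.items())]


def store(rows, conn):
    """Storage step: write the aggregates to a table a BI tool could query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_events "
        "(day TEXT, country TEXT, event TEXT, n INTEGER)"
    )
    conn.executemany("INSERT INTO daily_events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def conversions_by_country(conn):
    """Example of a custom metric: total conversions per country."""
    return conn.execute(
        "SELECT country, SUM(n) FROM daily_events "
        "WHERE event = 'conversion' GROUP BY country ORDER BY country"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # a file path would be used in practice
store(aggregate_daily(RAW_EVENTS), conn)
print(conversions_by_country(conn))
```

If the table were written to a SQLite file instead of memory, it could in principle be registered in Superset through a SQLAlchemy URI of the form `sqlite:///path/to/file.db`. For anything beyond experiments, one of the server databases mentioned above would be the more likely choice.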
If you are tired of the confusing Google Analytics UI, visualizations that offer very limited interactivity and inflexible dashboards that cover only the most basic needs, then a customized business analytics solution is the right one for you. Start by collecting your data and use it to improve your business. Get a deeper understanding of your customers via interactive data visualizations. Continuously analyze your customers’ behaviour and act on it immediately. Enable anyone in your company to build beautiful reports without writing a line of code.
Python and R developer and once-upon-a-time Ruby fan. Worked on web and mobile, but also on streaming-data IoT projects, as an entrepreneur, developer and sometimes IT architect. Astrophysicist turned Bioinformatics Analyst turned Data Science generalist. Enjoys how domain-specific problems trigger his curiosity and problem-solving drive.