25 Key Questions before you choose any stack for data analytics

- Pentaho

Stack for data analytics

  1. What is the data storage platform you would like to use?
  2. What data ingestion [ETL] tools would you like to use?
  3. What data processing tools would you like to use?
  4. What BI/analytics tools would you like to use?
  5. Can the solution support analyzing large volumes of historical data?
  6. Can the solution handle a high-volume, near-real-time feed?
  7. Will the solution support a large number of concurrent users?
  8. Will the solution work on Android mobiles and tablets in a mobile-friendly way?
  9. Can we offer analytics as a service [SaaS] with multiple instances?
  10. Can we white-label it as an OEM [embedded analytics] offering to go to market?
  11. Does it have visual designer tools to avoid coding?
  12. Is it feasible to publish and consume shared data via APIs?
  13. Does it allow data security at the user level and at finer-grained levels?
  14. Does it support a centralized authentication system with SSO?
  15. Is it flexible enough to integrate data science tools [such as R/Python/Weka]?
  16. Does it provide administration and monitoring for batch job execution, user audit logs, etc.?
  17. Does it scale through load balancing and job distribution?
  18. Does it provide pre-built connectors for traditional RDBMS and NoSQL/HDFS systems?
  19. Does it secure data during transformation and at rest?
  20. Does it support offline report generation?
  21. Does it support ad hoc analysis using OLAP cubes with slice/dice, drill-down, and sharing?
  22. Does it support self-service reporting, analysis, and dashboards?
  23. Does it allow third-party integrations for visualization, such as Google Maps, D3, Mapbox, etc.?
  24. Does it support semi-structured and social/web data [logs, blogs, docs, etc.]?
  25. Which data science tools does it support?