From charlesreid1

Goals and Motivation

Goals:

  • Implement a real-time inventory tracking system that records where each item is located
  • Perform data analytics on order and shipment logs (structured/unstructured data) to make decisions about deploying resources, targeting customers, and expanding into markets
  • Predict delays in shipments

Requirements:

  • Reliable, reproducible environment that scales
  • Aggregated data in centralized data lake
  • Historical data used to perform predictive analytics on future shipments
  • Accurate tracking of worldwide shipments (proprietary technology)
  • Improvement of business agility and speed of innovation via rapid provisioning of new resources
  • Analysis and optimization for performance in the cloud
  • Migration to cloud, if all other requirements met

Deeper reasoning:

  • Inability to upgrade infrastructure hampering growth and efficiency
  • Ineffective at moving data around
  • Need to better understand where/who customers are, what they are shipping
  • IT is too busy managing infrastructure to organize data/build analytics/implement tracking technology
  • Penalties for late shipments and deliveries cut directly into profitability and the bottom line

Technology Stack

Databases:

  • SQL DB storing user data, static data
  • Cassandra DB storing metadata, tracking messages
  • Kafka servers for aggregating tracking messages and batch-inserting them
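
The aggregation and batch-insert role of the Kafka tier can be pictured with a minimal sketch. It uses only the Python standard library and stands in for a real Kafka consumer and Cassandra writer. The names `TrackingMessage`, `batch_messages`, and `insert_batches`, the message fields, and the batch sizes are all hypothetical and not taken from the actual system.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class TrackingMessage:
    # Hypothetical message shape; the real schema is not given in these notes.
    shipment_id: str
    location: str


def batch_messages(messages: Iterable[TrackingMessage],
                   batch_size: int) -> Iterator[List[TrackingMessage]]:
    """Group a stream of messages into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[TrackingMessage] = []
    for message in messages:
        batch.append(message)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final, partially filled batch


def insert_batches(messages: Iterable[TrackingMessage],
                   batch_size: int,
                   write_batch: Callable[[List[TrackingMessage]], None]) -> int:
    """Call write_batch once per batch and return how many batches were written."""
    count = 0
    for batch in batch_messages(messages, batch_size):
        write_batch(batch)  # assumption: this would be one bulk database insert
        count += 1
    return count
```

Writing one batch per call rather than one message per call is the point of the sketch: the database sees fewer, larger writes.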

Applications:

  • Customer frontend, middleware for orders and customs
  • Tomcat for Java services
  • Nginx for static content
  • Batch servers (?)

Storage:

  • iSCSI (Internet Small Computer Systems Interface) storage for VM hosts
  • Fibre Channel network for SQL server storage
  • NAS (network attached storage) for image storage, logs, and backups

Analytics:

  • Hadoop/Spark servers
  • Core data lake
  • Data analysis workloads

Miscellaneous servers:

  • Jenkins
  • Monitoring of servers
  • Bastion hosts
  • Security scanners
  • Billing software

Using Google Cloud

Databases:

  • MySQL: Google Cloud offers the Cloud SQL service, which lets you allocate a dedicated instance running a MySQL (or PostgreSQL) server.
  • Cassandra: Google Cloud Launcher (since renamed Google Cloud Marketplace) has preconfigured solutions for many packages, including one for Cassandra.
  • Kafka: as with Cassandra, preconfigured Kafka instances are available through the Google Cloud Launcher.
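
The mapping in the list above can be written down as a small lookup table. This is only a sketch restating the three bullets. The names `GCP_DATABASE_OPTIONS` and `gcp_option_for` are hypothetical.

```python
# On-premises database technology -> Google Cloud option, as listed in the notes.
GCP_DATABASE_OPTIONS = {
    "mysql": "Cloud SQL",
    "cassandra": "Cloud Launcher preconfigured Cassandra solution",
    "kafka": "Cloud Launcher preconfigured Kafka instance",
}


def gcp_option_for(technology: str) -> str:
    """Return the Google Cloud option recorded for a technology name."""
    key = technology.strip().lower()
    if key not in GCP_DATABASE_OPTIONS:
        raise KeyError(f"no Google Cloud option recorded for {technology!r}")
    return GCP_DATABASE_OPTIONS[key]
```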


Note: Google publishes a full list of its Cloud products, which helps in working out which product fits which technology.

List of Google Cloud products: https://cloud.google.com/products/

List of Google Cloud Launcher preconfigured machines: https://console.cloud.google.com/launcher