Big Data
From charlesreid1
Apache Bigtop (project for packaging, testing, and configuring the Hadoop ecosystem, so that the many bundled components work together with Hadoop)
Apache Hadoop (distributed-data framework for clusters; distributes data among the nodes in a cluster, useful for data-intensive computing)
Apache Spark (cluster computing framework; performs computations on data; a separate layer from Hadoop that can sit on top of Hadoop or use some other distributed-data framework; fast because it is designed to read data from the cluster, perform a chain of operations in memory, and write the results back to the cluster, all in one pass)
Apache Hadoop MapReduce (similar in purpose to Spark, but operates differently: reads data from the cluster, performs an operation, writes results to the cluster, then reads the updated data back from the cluster, performs the next operation, writes the next results to the cluster, and so on)
Apache Pig
Apache Hive
Apache HBase
Apache Mahout (general machine learning library, like R but for big data sets; does not implement a comprehensive set of ML algorithms; check Apache Spark MLlib for algorithms that Mahout does not implement)
Apache Cassandra (distributed NoSQL database)
Apache TinkerPop/Gremlin (Apache TinkerPop and Gremlin are to graph databases what JDBC and SQL are to relational databases)
MongoDB (NoSQL document-based database)
Apache CloudStack (software for deploying and managing large numbers of nodes or virtual machines; basically, this is the back-end software used to run a cloud service provider)
Apache Sqoop
Talend
Apache Hama
Cloudera Impala
Apache Drill
Gephi
Neo4j
Couchbase
Paradigm4 SciDB
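
The difference between the Spark and MapReduce entries in this list (one pass over cluster storage versus a read/write round trip per operation) can be sketched with a toy word count. This is only an illustration in plain Python, not real Spark or Hadoop code: ClusterStore, multi_pass, and single_pass are hypothetical names made up here, and the "cluster" is just a dictionary that counts how often it is read and written.

```python
from collections import Counter


class ClusterStore:
    """Toy stand-in for cluster storage (hypothetical); counts reads and writes."""

    def __init__(self, lines):
        self.datasets = {"input": list(lines)}
        self.reads = 0
        self.writes = 0

    def read(self, name):
        self.reads += 1
        return list(self.datasets[name])

    def write(self, name, records):
        self.writes += 1
        self.datasets[name] = list(records)


def tokenize(lines):
    """Operation 1: split lines into lowercase words."""
    return [word.lower() for line in lines for word in line.split()]


def count_words(words):
    """Operation 2: count occurrences, sorted by word."""
    return sorted(Counter(words).items())


def multi_pass(store):
    """MapReduce-like: every operation reads from and writes to the store."""
    store.write("words", tokenize(store.read("input")))
    store.write("counts", count_words(store.read("words")))
    return store.read_counts() if False else store.datasets["counts"]


def single_pass(store):
    """Spark-like: read once, chain the operations in memory, write once."""
    counts = count_words(tokenize(store.read("input")))
    store.write("counts", counts)
    return counts


LINES = ["big data big cluster", "data on a cluster"]

mr_store = ClusterStore(LINES)
sp_store = ClusterStore(LINES)
mr_result = multi_pass(mr_store)
sp_result = single_pass(sp_store)

print(mr_result == sp_result)            # same answer either way
print(mr_store.reads, mr_store.writes)   # storage round trips, multi-pass
print(sp_store.reads, sp_store.writes)   # storage round trips, single pass
```

Both functions produce the same counts; the multi-pass version touches storage once per operation, while the single-pass version touches it only at the start and the end. With more chained operations the gap grows, which is the intuition behind the speed claim in the Spark entry.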