Migrating A Large ASCII Data Set From CSV To Oracle To DSE/Cassandra – A look at storage impact

The objective of this exercise is to demonstrate how migrating data from CSV to Oracle to Cassandra can change the underlying storage volume that the data requires. How do data volumes compare across pure ASCII source records, records stored in tables in an Oracle database, and records stored in tables in a

Migrating Relational Data From Oracle To DSE/Cassandra – using Spark DataFrames, SparkSQL and the spark-cassandra connector

The objective of this exercise is to demonstrate how to migrate data from Oracle to DataStax Cassandra. I’ll be using the DataFrame capability introduced in Apache Spark 1.3 to load data from tables in an Oracle database (12c) via Oracle’s JDBC thin driver, to generate a result set, joining tables where necessary. The data will

Saving geoJSON Data To DSE/Cassandra Using User-Defined Types, Spark DataFrames and Spark SQL

The geoJSON data format is described at geojson.org as “a format for encoding a variety of geographic data structures”. In this example I’ll be using a set of oil/gas well data supplied by the State of Colorado describing approximately 110,000 wells in the state. You can find a copy of this data at my GitHub