The objective of this exercise is to demonstrate how migrating data from CSV to Oracle to Cassandra can change the underlying storage volume the data requires. How do data volumes differ among pure ASCII source records, records stored in tables in an Oracle database, and records stored in tables in a
Migrating Relational Data From Oracle To DSE/Cassandra – using Spark DataFrames, SparkSQL and the spark-cassandra connector
The objective of this exercise is to demonstrate how to migrate data from Oracle to DataStax Cassandra. I’ll be using the DataFrame capability introduced in Apache Spark 1.3 to load data from tables in an Oracle database (12c) via Oracle’s JDBC thin driver, to generate a result set, joining tables where necessary. The data will
DataStax 5.0 Multi Instance – OpsCenter And datastax-agent Installation And Configuration
The new Multi-Instance feature released with DSE 5.0 allows for the simple deployment of multiple DSE instances on a single machine. For the first part in this series on Multi-Instance, look [here](https://github.com/simonambridge/DataStax-5.0-Multi-Instance-Demo). DataStax Multi-Instance documentation can be found [here](https://docs.datastax.com/en/latest-dse/datastax_enterprise/multiInstance/configMultiInstance.html). Install DataStax OpsCenter: deb http://datastaxrepo_gmail.com:utJVKEg4lKeaWTX@debian.datastax.com/enterprise stable main curl -L
DSE 5.0 Multi-Instance Demo – Build and Configure
The new Multi-Instance feature released with DSE 5.0 allows for the simple deployment of multiple DSE instances on a single machine. This helps customers ensure that large hardware resources are effectively utilized for database applications, and making more efficient use of hardware can help reduce overall project costs. DataStax Multi-Instance documentation
Saving geoJSON Data To DSE/Cassandra Using User-Defined Types, Spark Dataframes and Spark SQL
The geoJSON data format is described at geojson.org as “a format for encoding a variety of geographic data structures”. In this example I’ll be using a set of oil/gas well data supplied by the State of Colorado describing approximately 110,000 wells in the state. You can find a copy of this data at my GitHub