San Francisco, CA: DataStax Enterprise Search Using Apache Solr
Date(s) - January 19, 2015 - January 20, 2015
Time - All Day

Description: This course teaches the essentials of implementing search functionality using Apache Solr, how to configure it within DataStax Enterprise (DSE), and how to troubleshoot and solve common problems.

Length: 2 days

Audience: Anyone who needs to implement search functionality using DSE and Apache Solr.

Prerequisites: Completion of the Apache Cassandra: Core Concepts, Skills, and Tools course, or equivalent practical experience with Apache Cassandra. Students should have some Java programming experience and be comfortable using Linux command-line tools in order to complete the learning exercises.

Environment: Virtual Machine pre-configured with DSE, related tooling, and exercise files.

Learning Objectives

Session 1: Solr Overview

  • Describe the Solr architecture
  • Identify key Solr benefits
  • Enumerate ways to run Solr
  • Demonstrate creating a search index

Session 2: DSE/Solr Overview

  • Describe how to run DSE with Solr
  • Describe the relationship between Cassandra and Solr
  • Demonstrate defining a Solr core
  • Explain how to add, modify and delete data

Session 3: Search Fundamentals

  • Explain inverted indexes
  • Describe Lucene indexes

Session 4: Solr Schemas

  • Identify common field types
  • Construct a Solr schema

Session 5: Text Analysis

  • Describe the analysis process
  • Explain index vs. query time analysis
  • Solve the multiple-language problem

Session 6: Solr Configuration and Indexing

  • Describe the difference between soft and hard commits
  • Explain what happens during segment merging
  • List the common tuning parameters for indexing

Session 7: Solr Queries

  • Enumerate common request handlers
  • Explain the Lucene query syntax
  • List common query request parameters
  • Describe the response format

Session 8: DSE/Solr Deep Dive

  • List causes of Solr index synchronization issues
  • Demonstrate using CQL3 table features in Solr
  • Describe techniques to improve indexing performance

Session 9: Scalable Search

  • Identify key features of DSE/Solr integration for search
  • Explain how Cassandra features provide reliability and scalability

Session 10: Relational Database

  • Enumerate key secondary index limitations
  • Identify when to use Solr for SQL-style queries
  • Assess when Solr queries are not viable as SQL equivalents

Session 11: Machine Learning

  • Identify use cases for Solr as a recommendation engine
  • Describe how to use queries to find similar items

Session 12: DSE Configuration

  • Identify the most common configuration problems
  • Demonstrate finding and fixing configuration issues
  • Explain why incorrect JVM sizes cause poor performance

Session 13: Solr Configuration

  • Demonstrate the key steps in capacity planning
  • Enumerate key Solr configuration settings

Session 14: Troubleshooting Search

  • Demonstrate using the Solr UI to find search problems
  • Describe how to use debug info to find query problems
  • List common reasons for searches to fail