Exploring Solr dynamic fields in Sunspot

Indexing, filtering, sorting, and making coffee with dynamic fields

Posted on April 20, 2015

Sunspot provides a beautiful Ruby API for indexing and searching documents in Solr. Basically, a search document is represented by an ActiveRecord model, and its attributes or instance methods are mapped to Solr schema fields.

Imagine a search application where coaches, for example, can search for soccer players and look up their physical characteristics, technical skills, and match statistics. Nice, right? We start by creating a search model like this:

class SoccerPlayer < ActiveRecord::Base
  searchable do
    text :name
    integer :age
    double :weight
    double :height
    string :main_position
    integer :total_goals
    time :last_injury_at
  end
end

Sunspot maps those model fields into Solr documents and updates them automatically. More information about defining searchable models, indexing, and other features can be found in the Sunspot documentation.

Suppose we want to index and search for how good a player is at a given skill. The relevant skills differ by position (e.g., goalkeeper, midfielder). The first option is to index every skill for every player, even when a skill does not make much sense for a position. With this option, an attacker will have a handling skill indexed, even though handling only makes sense for a goalkeeper.

A more elegant solution is to use dynamic fields. With them, we can organize our data hierarchically and save index space, since we only store the data that makes sense for each player's position. I will cover those points in the next sections.

Defining and indexing

Dynamic fields are also defined inside the searchable block:

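class SoccerPlayer < ActiveRecord::Base
  searchable do
    # Index one integer field per skill; stored: true keeps the value
    # retrievable from search hits.
    dynamic_integer :tech_skill, stored: true do
      # Build a hash of skill name => level.
      soccer_skills.inject({}) do |hash, e|
        hash.merge(e.name => e.level)
      end
    end
  end

  # Only skills with a level above 0 are indexed.
  # Note: as written, this query is not scoped to the player; a real
  # application would likely go through an association instead.
  def soccer_skills
    TechnicalSkill.where("level > 0")
  end
end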

  1. I've set a tech_skill dynamic field with stored: true, which means that this field's values are stored in Solr and can be returned in our search results.
  2. We get the content of tech_skill by calling the soccer_skills method.
  3. We have to provide a hash to index the dynamic fields, so we use the hash variable to aggregate all soccer skills. The key of each entry is a skill name and the value is the player's level in that skill.
  4. The soccer_skills method returns a collection of TechnicalSkill objects. Each object has the name and level attributes.

Since we are getting data from a relational database, I only retrieve skills with a level greater than 0. During indexing, Sunspot applies all the steps above to each soccer player instance. If a player does not have a specific skill, that skill won't be indexed in Solr.
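The hash-building step can be tried outside of Sunspot. The snippet below is only a plain-Ruby sketch: the TechnicalSkill Struct is a stand-in for the real model, the tech_skill_hash helper and the sample data are hypothetical, and the level check is done in Ruby rather than in the database query.

```ruby
# Stand-in for the TechnicalSkill model: only name and level matter here.
TechnicalSkill = Struct.new(:name, :level)

# Builds the kind of hash a dynamic field block returns:
# skill name => level, skipping skills the player does not have.
def tech_skill_hash(skills)
  skills.each_with_object({}) do |skill, hash|
    hash[skill.name.to_sym] = skill.level if skill.level > 0
  end
end

# Hypothetical sample data for a goalkeeper.
keeper_skills = [
  TechnicalSkill.new("handling", 88),
  TechnicalSkill.new("long_passing", 61),
  TechnicalSkill.new("dribbling", 0) # level 0: left out of the hash
]

p tech_skill_hash(keeper_skills)
```

Here each_with_object does the same job as inject with merge, but mutates a single hash instead of allocating a new one per skill.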

For our Soccer Stats Manager application, dynamic fields can also be used to index physical or mental skills (e.g., acceleration, stamina, vision, aggression), or even statistics, such as goals or assists in a particular league.

Searching/Filtering

After indexing our dynamic fields, we can build more complex queries, such as filtering by a particular skill or by several skills at once.

SoccerPlayer.search do
  dynamic :tech_skill do
    with(:ball_control).greater_than(90)
    with(:dribbling).greater_than(92)
  end
end

In this search, we only want to find incredible soccer players with great ball control and dribbling skills.
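To make the meaning of this filter concrete, here is an in-memory analogy in plain Ruby. It is not how Solr evaluates the query. The SKILL_POOL data and the filter_by_skills helper are hypothetical, and the sketch assumes that a skill never indexed for a player simply cannot match.

```ruby
# Hypothetical in-memory stand-ins for indexed documents.
SKILL_POOL = [
  { name: "Player A", tech_skill: { ball_control: 95, dribbling: 96 } },
  { name: "Player B", tech_skill: { ball_control: 93, dribbling: 80 } },
  { name: "Player C", tech_skill: { handling: 90 } } # goalkeeper: no dribbling indexed
].freeze

# Keeps players whose levels exceed every given minimum.
# A skill that is missing for a player never matches.
def filter_by_skills(players, minimums)
  players.select do |player|
    minimums.all? do |skill, min|
      level = player[:tech_skill][skill]
      !level.nil? && level > min
    end
  end
end

matches = filter_by_skills(SKILL_POOL, { ball_control: 90, dribbling: 92 })
puts matches.map { |player| player[:name] }.inspect
```

Both conditions must hold for a player to be kept, which mirrors how the two with clauses narrow the result set together.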

Sorting

After indexing all the soccer players of the world's major leagues, we may want to find the best players and set up a dream team. Or something less ambitious, like finding the best players in a particular category (e.g., younger than 20 years). In both cases, we need to sort our results.

SoccerPlayer.search do
  dynamic :tech_skill do
    order_by(:ball_control, :desc)
  end
end

Here I'm searching for all soccer players and ranking them by their ball_control skill in descending order.
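As a rough in-memory analogy of that ordering (again, not how Solr does it), the plain-Ruby sketch below ranks hypothetical players by one skill. Where documents without the field end up in a real Solr sort depends on the schema configuration (for example the sortMissingLast field type attribute). Placing them last here is an assumption made for illustration.

```ruby
# Hypothetical in-memory stand-ins for indexed documents.
RANKING_POOL = [
  { name: "Player B", tech_skill: { ball_control: 93 } },
  { name: "Player C", tech_skill: { handling: 90 } }, # no ball_control indexed
  { name: "Player A", tech_skill: { ball_control: 95 } }
].freeze

# Ranks players by a skill in descending order.
# Players without the skill are appended at the end (an assumption).
def rank_by_skill(players, skill)
  with_skill, without_skill = players.partition do |player|
    player[:tech_skill].key?(skill)
  end
  with_skill.sort_by { |player| -player[:tech_skill][skill] } + without_skill
end

ranking = rank_by_skill(RANKING_POOL, :ball_control)
puts ranking.map { |player| player[:name] }.inspect
```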

Getting results

Suppose we ranked the soccer players by all the technical and physical skills we have indexed and assigned the result to @players. The result is a Sunspot::Search object, which we can explore to find out who is the best soccer player in the world!

best_player = @players.hits.first

puts "Name: #{best_player.stored(:name)}"
puts "Ball Control: #{best_player.stored(:tech_skill, :ball_control)}"
puts "Finishing: #{best_player.stored(:tech_skill, :finishing)}"
puts "Long Passing: #{best_player.stored(:tech_skill, :long_passing)}"
puts "Agility: #{best_player.stored(:physic_skill, :agility)}"
puts "Vision: #{best_player.stored(:mental_skill, :vision)}"

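In the example above, name is a regular field, while all the other fields are dynamic fields. Keep in mind that stored only returns values for fields declared with stored: true, so name would need that option as well. After running it, we would probably get the following result: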

Name: Lionel Messi
Ball Control: 99
Finishing: 100
Long Passing: 98
Agility: 100
Vision: 92