
Text Search - quickstart

This tutorial creates a basic text search application based on the MS MARCO dataset, similar to Vespa’s text search tutorials.

Refer to troubleshooting if you run into problems with this guide.

Install pyvespa and start Docker, then verify that Docker has at least 4 GB of memory available:

[ ]:
!docker info | grep "Total Memory"
 Total Memory: 11.7GiB
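
The same check can be scripted. The sketch below parses the "Total Memory" line shown above and compares it against the 4 GB minimum. The helper name parse_total_memory_gib is hypothetical, and the set of unit suffixes handled is an assumption about what docker info may print:

```python
import re

# Hypothetical helper: extracts the "Total Memory" value from `docker info`
# output and converts it to GiB. The unit suffixes handled here are an
# assumption, not taken from the Docker documentation.
def parse_total_memory_gib(docker_info_text):
    match = re.search(r"Total Memory:\s*([\d.]+)\s*(KiB|MiB|GiB|TiB)", docker_info_text)
    if match is None:
        return None
    value = float(match.group(1))
    factor = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0, "TiB": 1024.0}
    return value * factor[match.group(2)]

# Sample line copied from the cell output above.
sample = " Total Memory: 11.7GiB"
mem = parse_total_memory_gib(sample)
has_enough = mem is not None and mem >= 4.0
print(f"Docker memory: {mem} GiB, enough for this guide: {has_enough}")
```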

Create an application package

The application package has all the Vespa configuration files - create one from scratch:

[ ]:
from vespa.package import ApplicationPackage

app_package = ApplicationPackage(name="textsearch")

Note that the application name cannot contain - or _.

The above will create an empty schema with the same name as the application package.
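
A name can be checked before creating the package. This is a minimal sketch that encodes only the rule stated above (no - or _). The function is_valid_app_name is hypothetical, and pyvespa or Vespa may enforce further restrictions not covered here:

```python
# Hypothetical pre-check covering only the naming rule stated in this guide:
# the application name must not contain "-" or "_".
def is_valid_app_name(name):
    return bool(name) and "-" not in name and "_" not in name

for candidate in ["textsearch", "text-search", "text_search"]:
    print(candidate, is_valid_app_name(candidate))
```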

Add fields to the schema

Define the id, title, and body fields:

[ ]:
from vespa.package import Field

app_package.schema.add_fields(
    Field(name = "id",    type = "string", indexing = ["attribute", "summary"]),
    Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25"),
    Field(name = "body",  type = "string", indexing = ["index", "summary"], index = "enable-bm25")
)
  • id holds the document ids, while title and body are the text fields of the documents.
  • Setting "index" in indexing means that a searchable index for title and body is created. Read more about indexing options.
  • Setting index = "enable-bm25" will pre-compute quantities to make it fast to compute the BM25 score.
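
To give a feel for what a BM25 score measures, here is a minimal sketch of the textbook Okapi BM25 formula in plain Python. It is not Vespa's implementation. The function name bm25_score, the toy corpus, and the parameter values k1 = 1.2 and b = 0.75 are assumptions chosen for illustration. The per-document length and the average length are examples of quantities that are cheaper to have ready in advance:

```python
import math
from collections import Counter

# Textbook Okapi BM25 over tokenized documents (illustrative only).
# k1 controls term-frequency saturation, b controls length normalization.
def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue  # terms absent from the document contribute nothing
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score

# Toy corpus built from titles that appear in this guide.
corpus = [
    "what keeps airplanes in the air".split(),
    "how long is a tax refund check good".split(),
    "the eagle has flown".split(),
]
query = "planes air".split()
scores = [bm25_score(query, doc, corpus) for doc in corpus]
print(scores)
```

Only the first document contains a query term ("air"), so it is the only one with a non-zero score. Note that "planes" does not match "airplanes" in this sketch because matching is on exact tokens.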

Search multiple fields

A FieldSet groups fields together for searching - it configures queries to look for matches both in the title and body fields of the documents:

[ ]:
from vespa.package import FieldSet

app_package.schema.add_field_set(
    FieldSet(name = "default", fields = ["title", "body"])
)

Define ranking

Specify how to rank the matched documents by defining a RankProfile. The two rank profiles below, bm25 and native_rank, can each be selected at query time:

[ ]:
from vespa.package import RankProfile

app_package.schema.add_rank_profile(
    RankProfile(name = "bm25", first_phase = "bm25(title) + bm25(body)")
)
app_package.schema.add_rank_profile(
    RankProfile(name = "native_rank", first_phase = "nativeRank(title, body)")
)

Deploy

The text search app with fields, a fieldset to group fields together, and rank profiles is now defined and ready to deploy. Deploy app_package on the local machine using Docker, without leaving the notebook, by creating an instance of VespaDocker:

[ ]:
from vespa.deployment import VespaDocker

vespa_docker = VespaDocker()
app = vespa_docker.deploy(application_package=app_package)
Waiting for configuration server, 0/300 seconds...
Waiting for configuration server, 5/300 seconds...
Waiting for application status, 0/300 seconds...
Waiting for application status, 5/300 seconds...
Waiting for application status, 10/300 seconds...
Finished deployment.

app now holds a reference to a running Vespa instance. To recap: pyvespa provides a Python API for defining Vespa application packages, as done above, and VespaDocker.deploy deploys the package to a Vespa instance running in Docker.

Feed

Download a sample of approximately 10,000 documents:

[ ]:
from pandas import read_csv

docs = read_csv(
    filepath_or_buffer="https://data.vespa.oath.cloud/blog/msmarco/sample_docs.csv"
).fillna('')
docs.head()
   id        title                                              body
0  D1712962  Can you eat crab or imitation krab when you ha...  Answers com Wiki Answers Categories Health...
1  D1817294  How long is a tax refund check good                Answers com Wiki Answers Categories Busine...
2  D1761039  The Suffolk Resolves 1774                          The Suffolk Resolves 1774 Across New England ...
3  D2899268  The eagle has flown                                Download citation Share Download full text PDF...
4  D3278481  22b Cotton and African American Life               22b Cotton and African American Life Two thi...

Feed the documents to the application:

[ ]:
feed_res = app.feed_df(docs, asynchronous=False, batch_size=1000)
Successful documents fed: 992/1000.
Batch progress: 1/10.
Successful documents fed: 992/1000.
Batch progress: 2/10.
Successful documents fed: 987/1000.
Batch progress: 3/10.
Successful documents fed: 992/1000.
Batch progress: 4/10.
Successful documents fed: 995/1000.
Batch progress: 5/10.
Successful documents fed: 993/1000.
Batch progress: 6/10.
Successful documents fed: 995/1000.
Batch progress: 7/10.
Successful documents fed: 991/1000.
Batch progress: 8/10.
Successful documents fed: 993/1000.
Batch progress: 9/10.
Successful documents fed: 957/963.
Batch progress: 10/10.
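
The output shows that a few documents in each batch were not fed successfully. As a quick sanity check, the per-batch counts can be tallied. The sketch below parses the log lines printed above. The log text is copied from this guide, and the regular expression is an assumption about its format:

```python
import re

# Per-batch log lines copied from the feed output above.
feed_log = """
Successful documents fed: 992/1000.
Successful documents fed: 992/1000.
Successful documents fed: 987/1000.
Successful documents fed: 992/1000.
Successful documents fed: 995/1000.
Successful documents fed: 993/1000.
Successful documents fed: 995/1000.
Successful documents fed: 991/1000.
Successful documents fed: 993/1000.
Successful documents fed: 957/963.
"""

# Each match yields (successful, attempted) for one batch.
batches = [
    (int(ok), int(attempted))
    for ok, attempted in re.findall(r"Successful documents fed: (\d+)/(\d+)\.", feed_log)
]
fed = sum(ok for ok, _ in batches)
total = sum(attempted for _, attempted in batches)
print(f"{fed}/{total} documents fed, {total - fed} failed")
```

For this run, 9887 of 9963 documents were fed, which is enough for the query example that follows.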

Query

Query the text search app using the Vespa Query Language (YQL) by passing the query parameters to the body argument of Vespa.query. This example uses the bm25 rank profile:

[ ]:
query = {
    'yql': 'select * from sources * where userQuery()',
    'query': 'what keeps planes in the air',
    'ranking': 'bm25',
    'type': 'all',
    'hits': 10
}
res = app.query(body=query)
res.hits[0]
{'id': 'id:textsearch:textsearch::D1871659',
 'relevance': 25.629646778721742,
 'source': 'textsearch_content',
 'fields': {'sddocname': 'textsearch',
  'documentid': 'id:textsearch:textsearch::D1871659',
  'id': 'D1871659',
  'title': 'What keeps airplanes in the air ',
  'body': 'Answers com   Wiki Answers   Categories Cars   Vehicles Airplanes and Aircraft What keeps airplanes in the air  Flag What keeps airplanes in the air  Answer by Karin L  Confidence votes 95 0KThere s more to raising cattle than throwing them out to pasture  Know your soil and plants to earn profit above ground and wealth below  It is the combined forces of lift  thrust and weight that keeps an airplane in the air  Lift happens to be the largest force in this equation  and is dependent on the speed of the wing  or how fast an airplane is going   vertical velocity of air and air density  Well the elevator the rudder will help and something else I forgot what it was but don t judge me for that               And that s how you be a bow done   Like a boss  Boss    15 people found this useful Was this answer useful  Yes Somewhat No How do airplane windows keep out the cold  Airplane windows   The only way that heat can escape the warm cabin is to travel through something or radiate outward  Since the windows are so small  the radiation through  Karin L  There s more to raising cattle than throwing them out to pasture  Know your soil and plants to earn profit above ground and wealth below  Does speed keep an airplane in the air  Yes  to a degree  speed is part of the equation  Speed   thrust  and combine that with lift and weight  though weight has to be smaller than lift and thrust combined     then y  Bala Surya 152 866 Contributions Adventurous  Fun   Dreaming High How does bernoulli s principle keep airplanes in the air  Bernoulli s principle is that there is a region of high pressure under the wing  So air rushes under the plane  So it creates lift which in turn keeps the airplane in the air   How airplane can fly in the air  The airplane fly on the air by 4 main forces   drag   lift   thrust and weight   all these forces affect of the performances of the airplane to fly     the high power of the e  David Bäckman 388 346 Contributions Knowledge is a thing 
you can both share and keep Is the force that keeps an airplane in the air called lift or levitation  Lift Sadia rulez 1 Contribution How does air help an airplane fly  Air Helps An Aeroplane Fly Because Of The Up Thurst  Up Thrust Is A Sort Of A Gravity That Pulls You Up Like A Float Floats In Water The Upthrust Pulls It Up But The Gravity Pull  Djlax97 3 Contributions How do you keep your ears from popping on an airplane  All you have to do is chew gum and swallow a lot Doing this has something to do with the place of your throat And yes it does work What does it mean when the air in airplanes are pressurized  The air in an aircraft needs to be pressurised so that the people within the cabin don t pass out from oxygen starvation at higher altitudes  The atmosphere can be described a  Richard Loberger 26 278 Contributions Airplane can stop in the air  It would depend on what you mean by stop in the air  An airplane can have 0 MPH ground speed while in the air only IF the wind is going faster then the stall speed of the g  How does an airplane stay stable in the air to keep a plane stable in the air it has different control surfaces or panels to allow the pilot to adjust the position of the plane in the air  Some modern fighter jets such  How do you recharge an airplane Air Conditioner  An airplane airconditioner is completely different than the one in your house or car  It doesn t rely on a refrigerant  Rather it takes hot high pressure air from the hot comp  What keeps an airplane up in the sky  Bernoulli s Principle the statement that an increase in the speed of a fluid produces a decrease in pressure and a decrease in the speed produces an increase in pressure Win  Karin L  There s more to raising cattle than throwing them out to pasture  Know your soil and plants to earn profit above ground and wealth below  Answered In Physics What keeps the airplane from rolling unexpectedly  On the tarmac  there are triangular blocks that are placed in front and behind each 
wheel of the airplane  called wheel chocks  In the air  a steady hand on the control sti  David Bäckman 388 346 Contributions Knowledge is a thing you can both share and keep Answered In Airplanes and Aircraft What is a machine that keeps an airplane on course  An auto pilot Charlie N  122 923 Contributions I have spent many years renovating buildings and leading a commercial handyman crew  Answered In Airbus Machine that keeps an airplane on course  Autopilot keeps an aircraft on course  In modern times autopilot is assisted by GPS and radar  Answered In Airplanes and Aircraft What keeps an airplane moving forward  An engine producing THRUST keeps an airplane moving forward  Types of engines used by airplanes include reciprocating engines  turbo prop engines  turbojet  and turbofan engin  Levyharaivan 396 Contributions Answered In Airplanes and Aircraft What keeps a airplane from rolling unexpectedly  Brakes  just like any other vehicle'}}
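
Since the body field can be very long, it is often handier to build queries with a small helper and print only a few fields per hit. The sketch below assumes the query parameters and hit layout shown above. The names build_query and summarize_hit are hypothetical, and the sample hit is abbreviated from the output above:

```python
# Hypothetical helper: builds a query body with the same parameters as the
# example above, making the rank profile and hit count easy to vary.
def build_query(text, ranking="bm25", hits=10):
    return {
        "yql": "select * from sources * where userQuery()",
        "query": text,
        "ranking": ranking,
        "type": "all",
        "hits": hits,
    }

# Hypothetical helper: keeps only the id, relevance, and title of a hit.
def summarize_hit(hit):
    fields = hit.get("fields", {})
    return {
        "id": fields.get("id"),
        "relevance": hit.get("relevance"),
        "title": fields.get("title", "").strip(),
    }

# Abbreviated copy of the first hit printed above (body omitted).
sample_hit = {
    "id": "id:textsearch:textsearch::D1871659",
    "relevance": 25.629646778721742,
    "fields": {"id": "D1871659", "title": "What keeps airplanes in the air "},
}

native_query = build_query("what keeps planes in the air", ranking="native_rank")
summary = summarize_hit(sample_hit)
print(native_query)
print(summary)
```

The resulting dictionary can be passed as app.query(body=native_query) to compare the native_rank profile against bm25 for the same query text.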

Cleanup

Stop and remove the Vespa Docker container when done:

[ ]:
vespa_docker.container.stop()
vespa_docker.container.remove()

Next steps

This is just an introduction to the capabilities of Vespa and pyvespa. Browse the site to learn more about schemas, feeding, and queries, and find more complex applications in examples.