Text Search - quickstart
This tutorial creates a basic text search application based on the MS MARCO dataset, similar to Vespa’s text search tutorials.
Refer to troubleshooting if you run into problems while following this guide.
Install pyvespa and start Docker, then verify that Docker has at least 4 GB of memory available:
[ ]:
!docker info | grep "Total Memory"
Total Memory: 11.7GiB
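The same check can be scripted instead of read by eye. The following is a minimal sketch, assuming the `Total Memory` line has the format shown above and treating the 4 GB minimum as 4 GiB; the helper name `docker_memory_gib` and the unit table are illustrative and not part of pyvespa or Docker:

```python
import re

MIN_GIB = 4.0  # minimum stated in this guide, interpreted here as GiB (assumption)

def docker_memory_gib(docker_info_output: str) -> float:
    """Extract the 'Total Memory' value from `docker info` output, in GiB."""
    match = re.search(r"Total Memory:\s*([\d.]+)\s*(KiB|MiB|GiB|TiB)", docker_info_output)
    if match is None:
        raise ValueError("no 'Total Memory' line found")
    value, unit = float(match.group(1)), match.group(2)
    # Conversion factors to GiB for the binary units handled by this sketch.
    factor = {"KiB": 1 / 1024**2, "MiB": 1 / 1024, "GiB": 1.0, "TiB": 1024.0}[unit]
    return value * factor

sample = "Total Memory: 11.7GiB"  # the line printed by the cell above
memory_gib = docker_memory_gib(sample)
has_enough_memory = memory_gib >= MIN_GIB
print(memory_gib, has_enough_memory)
```

In a notebook, the text passed to the helper could come from capturing the output of the `docker info` command.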
Create an application package
The application package has all the Vespa configuration files - create one from scratch:
[ ]:
from vespa.package import ApplicationPackage
app_package = ApplicationPackage(name="textsearch")
Note that the name cannot contain - or _.
The above will create an empty schema with the same name as the application package.
Add fields to the schema
[ ]:
from vespa.package import Field
app_package.schema.add_fields(
    Field(name = "id", type = "string", indexing = ["attribute", "summary"]),
    Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25"),
    Field(name = "body", type = "string", indexing = ["index", "summary"], index = "enable-bm25")
)
- id holds the document ids, while title and body are the text fields of the documents.
- Setting "index" in indexing means that a searchable index for title and body is created. Read more about indexing options.
- Setting index = "enable-bm25" will pre-compute quantities to make it fast to compute the BM25 score.
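To make the BM25 score mentioned above concrete, here is a sketch of the textbook Okapi BM25 formula in plain Python. It is not Vespa's implementation: the tokenized toy corpus, the function name `bm25_score`, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for illustration. It does show the kind of statistics the score depends on (term frequency, document frequency, field length, and average field length):

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Textbook BM25 of one tokenized document against a tokenized corpus."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # documents containing the term
        if df == 0 or tf[term] == 0:
            continue  # the term contributes nothing to this document
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        length_norm = 1 - b + b * len(doc_terms) / avg_len
        score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
    return score

# Toy corpus built from lowercased, whitespace-split titles (illustrative only).
corpus = [
    "what keeps airplanes in the air".split(),
    "how long is a tax refund check good".split(),
    "the eagle has flown".split(),
]
query = ["airplanes", "air"]
scores = [bm25_score(query, doc, corpus) for doc in corpus]
print(scores)
```

Only the first document contains the query terms, so it is the only one with a non-zero score.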
Search multiple fields
A FieldSet groups fields together for searching - it configures queries to look for matches both in the title and body fields of the documents:
[ ]:
from vespa.package import FieldSet
app_package.schema.add_field_set(
    FieldSet(name = "default", fields = ["title", "body"])
)
Define ranking
Specify how to rank the matched documents by defining a RankProfile. Below are different rank profiles that can be selected in the query:
[ ]:
from vespa.package import RankProfile
app_package.schema.add_rank_profile(
    RankProfile(name = "bm25", first_phase = "bm25(title) + bm25(body)")
)
app_package.schema.add_rank_profile(
    RankProfile(name = "native_rank", first_phase = "nativeRank(title, body)")
)
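The rank profile is selected per query through the ranking parameter, as the Query section of this tutorial does for bm25. A minimal sketch, assuming a hypothetical helper `build_query` (not part of pyvespa) that fills in the same parameters used in that section, builds one query body per profile defined above:

```python
def build_query(text, rank_profile="bm25", hits=10):
    """Build a query body that selects one of the rank profiles by name."""
    return {
        "yql": "select * from sources * where userQuery()",
        "query": text,
        "ranking": rank_profile,  # must match a RankProfile name in the schema
        "type": "all",
        "hits": hits,
    }

# One body per rank profile defined in the schema above.
bodies = {
    name: build_query("what keeps planes in the air", rank_profile=name)
    for name in ("bm25", "native_rank")
}
print(bodies["native_rank"]["ranking"])
```

After deployment, each body could be passed to app.query(body=...) to compare how the two profiles order the same matches.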
Deploy
The text search app with fields, a fieldset to group fields together, and rank profiles is now defined and ready to deploy. Deploy app_package on the local machine using Docker, without leaving the notebook, by creating an instance of VespaDocker:
[ ]:
import os
from vespa.deployment import VespaDocker
vespa_docker = VespaDocker()
app = vespa_docker.deploy(application_package=app_package)
Waiting for configuration server, 0/300 seconds...
Waiting for configuration server, 5/300 seconds...
Waiting for application status, 0/300 seconds...
Waiting for application status, 5/300 seconds...
Waiting for application status, 10/300 seconds...
Finished deployment.
app now holds a reference to a Vespa instance. pyvespa provides an API to define Vespa application packages from Python, as done above, and VespaDocker.deploy deploys the package to the Vespa instance.
Feed
Download approximately 10K documents:
[ ]:
from pandas import read_csv
docs = read_csv(
    filepath_or_buffer="https://data.vespa.oath.cloud/blog/msmarco/sample_docs.csv"
).fillna('')
docs.head()
| | id | title | body |
|---|---|---|---|
| 0 | D1712962 | Can you eat crab or imitation krab when you ha... | Answers com Wiki Answers Categories Health... |
| 1 | D1817294 | How long is a tax refund check good | Answers com Wiki Answers Categories Busine... |
| 2 | D1761039 | The Suffolk Resolves 1774 | The Suffolk Resolves 1774 Across New England ... |
| 3 | D2899268 | The eagle has flown | Download citation Share Download full text PDF... |
| 4 | D3278481 | 22b Cotton and African American Life | 22b Cotton and African American Life Two thi... |
Feed the documents to the application:
[ ]:
feed_res = app.feed_df(docs, asynchronous=False, batch_size=1000)
Successful documents fed: 992/1000.
Batch progress: 1/10.
Successful documents fed: 992/1000.
Batch progress: 2/10.
Successful documents fed: 987/1000.
Batch progress: 3/10.
Successful documents fed: 992/1000.
Batch progress: 4/10.
Successful documents fed: 995/1000.
Batch progress: 5/10.
Successful documents fed: 993/1000.
Batch progress: 6/10.
Successful documents fed: 995/1000.
Batch progress: 7/10.
Successful documents fed: 991/1000.
Batch progress: 8/10.
Successful documents fed: 993/1000.
Batch progress: 9/10.
Successful documents fed: 957/963.
Batch progress: 10/10.
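The per-batch counts above show that not every document was fed successfully. A short sketch for tallying the totals from that log follows; the log text is copied from the output above, and the regular expression assumes exactly that line format:

```python
import re

# Per-batch lines copied from the feed output above.
feed_log = """\
Successful documents fed: 992/1000.
Successful documents fed: 992/1000.
Successful documents fed: 987/1000.
Successful documents fed: 992/1000.
Successful documents fed: 995/1000.
Successful documents fed: 993/1000.
Successful documents fed: 995/1000.
Successful documents fed: 991/1000.
Successful documents fed: 993/1000.
Successful documents fed: 957/963.
"""

pattern = re.compile(r"Successful documents fed: (\d+)/(\d+)\.")
batches = [(int(ok), int(size)) for ok, size in pattern.findall(feed_log)]

fed = sum(ok for ok, _ in batches)        # documents accepted
total = sum(size for _, size in batches)  # documents sent
failed = total - fed
print(f"{fed}/{total} documents fed, {failed} failed")
# prints: 9887/9963 documents fed, 76 failed
```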
Query
Query the text search app using the Vespa Query Language by passing the query parameters to the body argument of Vespa.query, here using the bm25 rank profile:
[ ]:
query = {
    'yql': 'select * from sources * where userQuery()',
    'query': 'what keeps planes in the air',
    'ranking': 'bm25',
    'type': 'all',
    'hits': 10
}
res = app.query(body=query)
res.hits[0]
{'id': 'id:textsearch:textsearch::D1871659',
'relevance': 25.629646778721742,
'source': 'textsearch_content',
'fields': {'sddocname': 'textsearch',
'documentid': 'id:textsearch:textsearch::D1871659',
'id': 'D1871659',
'title': 'What keeps airplanes in the air ',
'body': 'Answers com Wiki Answers Categories Cars Vehicles Airplanes and Aircraft What keeps airplanes in the air Flag What keeps airplanes in the air Answer by Karin L Confidence votes 95 0KThere s more to raising cattle than throwing them out to pasture Know your soil and plants to earn profit above ground and wealth below It is the combined forces of lift thrust and weight that keeps an airplane in the air Lift happens to be the largest force in this equation and is dependent on the speed of the wing or how fast an airplane is going vertical velocity of air and air density Well the elevator the rudder will help and something else I forgot what it was but don t judge me for that And that s how you be a bow done Like a boss Boss 15 people found this useful Was this answer useful Yes Somewhat No How do airplane windows keep out the cold Airplane windows The only way that heat can escape the warm cabin is to travel through something or radiate outward Since the windows are so small the radiation through Karin L There s more to raising cattle than throwing them out to pasture Know your soil and plants to earn profit above ground and wealth below Does speed keep an airplane in the air Yes to a degree speed is part of the equation Speed thrust and combine that with lift and weight though weight has to be smaller than lift and thrust combined then y Bala Surya 152 866 Contributions Adventurous Fun Dreaming High How does bernoulli s principle keep airplanes in the air Bernoulli s principle is that there is a region of high pressure under the wing So air rushes under the plane So it creates lift which in turn keeps the airplane in the air How airplane can fly in the air The airplane fly on the air by 4 main forces drag lift thrust and weight all these forces affect of the performances of the airplane to fly the high power of the e David Bäckman 388 346 Contributions Knowledge is a thing you can both share and keep Is the force that keeps an airplane in the air called lift 
or levitation Lift Sadia rulez 1 Contribution How does air help an airplane fly Air Helps An Aeroplane Fly Because Of The Up Thurst Up Thrust Is A Sort Of A Gravity That Pulls You Up Like A Float Floats In Water The Upthrust Pulls It Up But The Gravity Pull Djlax97 3 Contributions How do you keep your ears from popping on an airplane All you have to do is chew gum and swallow a lot Doing this has something to do with the place of your throat And yes it does work What does it mean when the air in airplanes are pressurized The air in an aircraft needs to be pressurised so that the people within the cabin don t pass out from oxygen starvation at higher altitudes The atmosphere can be described a Richard Loberger 26 278 Contributions Airplane can stop in the air It would depend on what you mean by stop in the air An airplane can have 0 MPH ground speed while in the air only IF the wind is going faster then the stall speed of the g How does an airplane stay stable in the air to keep a plane stable in the air it has different control surfaces or panels to allow the pilot to adjust the position of the plane in the air Some modern fighter jets such How do you recharge an airplane Air Conditioner An airplane airconditioner is completely different than the one in your house or car It doesn t rely on a refrigerant Rather it takes hot high pressure air from the hot comp What keeps an airplane up in the sky Bernoulli s Principle the statement that an increase in the speed of a fluid produces a decrease in pressure and a decrease in the speed produces an increase in pressure Win Karin L There s more to raising cattle than throwing them out to pasture Know your soil and plants to earn profit above ground and wealth below Answered In Physics What keeps the airplane from rolling unexpectedly On the tarmac there are triangular blocks that are placed in front and behind each wheel of the airplane called wheel chocks In the air a steady hand on the control sti David Bäckman 388 346 
Contributions Knowledge is a thing you can both share and keep Answered In Airplanes and Aircraft What is a machine that keeps an airplane on course An auto pilot Charlie N 122 923 Contributions I have spent many years renovating buildings and leading a commercial handyman crew Answered In Airbus Machine that keeps an airplane on course Autopilot keeps an aircraft on course In modern times autopilot is assisted by GPS and radar Answered In Airplanes and Aircraft What keeps an airplane moving forward An engine producing THRUST keeps an airplane moving forward Types of engines used by airplanes include reciprocating engines turbo prop engines turbojet and turbofan engin Levyharaivan 396 Contributions Answered In Airplanes and Aircraft What keeps a airplane from rolling unexpectedly Brakes just like any other vehicle'}}
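Because the body field is long, a full hit is hard to read. The sketch below reduces a hit to its id, relevance, and title. It assumes hits have the dictionary shape shown above; `summarize_hit` is a hypothetical helper, and the sample body is truncated for brevity:

```python
def summarize_hit(hit):
    """Keep only the id, relevance score, and title of a hit dictionary."""
    fields = hit.get("fields", {})
    return {
        "id": fields.get("id"),
        "relevance": hit.get("relevance"),
        "title": fields.get("title", "").strip(),
    }

# Sample hit with the structure shown above (body truncated for brevity).
sample_hit = {
    "id": "id:textsearch:textsearch::D1871659",
    "relevance": 25.629646778721742,
    "source": "textsearch_content",
    "fields": {
        "sddocname": "textsearch",
        "documentid": "id:textsearch:textsearch::D1871659",
        "id": "D1871659",
        "title": "What keeps airplanes in the air ",
        "body": "Answers com Wiki Answers Categories Cars Vehicles ...",
    },
}

summary = summarize_hit(sample_hit)
print(summary)
```

In the notebook, the same helper could be applied to every hit with [summarize_hit(h) for h in res.hits].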
Cleanup
[ ]:
vespa_docker.container.stop()
vespa_docker.container.remove()
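If an earlier cell fails, the cleanup cell may never run and the container keeps running. One way to guard against that is a context manager that always calls the same stop() and remove() methods used above. This is a sketch: `removed_on_exit` is a hypothetical helper, and `FakeContainer` is a stand-in for vespa_docker.container so the example runs without Docker:

```python
from contextlib import contextmanager

@contextmanager
def removed_on_exit(container):
    """Yield the container, then stop and remove it even if the body raises."""
    try:
        yield container
    finally:
        container.stop()
        container.remove()

class FakeContainer:
    """Stand-in for vespa_docker.container that records the calls it receives."""
    def __init__(self):
        self.calls = []

    def stop(self):
        self.calls.append("stop")

    def remove(self):
        self.calls.append("remove")

fake = FakeContainer()
try:
    with removed_on_exit(fake):
        raise RuntimeError("simulated failure while feeding or querying")
except RuntimeError:
    pass

print(fake.calls)
# prints: ['stop', 'remove']
```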