Text Search on Vespa Cloud - quickstart
This is the same guide as getting-started-pyvespa, deploying to Vespa Cloud.
Pre-requisite: Create a tenant at cloud.vespa.ai, save the tenant name.
Install
Install pyvespa >= 0.35 and the Vespa CLI. The Vespa CLI is used for key management:
[1]:
!pip3 install pyvespa
Install the Vespa CLI using homebrew:
[2]:
!brew install vespa-cli
Alternatively, if running in Colab, download the Vespa CLI:
[3]:
import os
import requests
res = requests.get(url="https://api.github.com/repos/vespa-engine/vespa/releases/latest").json()
os.environ["VERSION"] = res["tag_name"].replace("v", "")
!curl -fsSL https://github.com/vespa-engine/vespa/releases/download/v${VERSION}/vespa-cli_${VERSION}_linux_amd64.tar.gz | tar -zxf -
!ln -sf /content/vespa-cli_${VERSION}_linux_amd64/bin/vespa /usr/local/bin/vespa
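The cell above turns the release tag into a version number with replace("v", ""), which removes every "v" in the tag. As a minimal sketch, the helper below strips only a leading "v" and builds the same download URL pattern used by the curl command. The function name, the platform parameter and the tag in the usage line are hypothetical, chosen only for illustration:

```python
def cli_download_url(tag_name: str, platform: str = "linux_amd64") -> str:
    """Build the Vespa CLI tarball URL for a release tag such as 'v1.2.3'."""
    # Strip a single leading 'v' only; str.replace would remove every 'v'.
    version = tag_name[1:] if tag_name.startswith("v") else tag_name
    return (
        "https://github.com/vespa-engine/vespa/releases/download/"
        f"v{version}/vespa-cli_{version}_{platform}.tar.gz"
    )


# 'v1.2.3' is a made-up tag, not a real release.
print(cli_download_url("v1.2.3"))
```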
Configure application and keys
Create Vespa Cloud data-plane cert/key-pair. We save the paths to the credentials, for later dataplane access without using pyvespa APIs - see example at the end of this notebook.
[4]:
import os
os.environ["TENANT_NAME"] = "mytenant" # Your tenant name here
!vespa config set target cloud
!vespa config set application ${TENANT_NAME}.textsearch
!vespa auth cert -N
[5]:
cert_path = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-public-cert.pem"
key_path = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-private-key.pem"
Note that the subsequent deploy-call below will add data-plane-public-cert.pem to the application before deploying it.
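The credential paths above are hard-coded for one user. A minimal sketch of deriving them from the home directory follows, assuming the CLI stores the pair under ~/.vespa/<tenant>.<application>.<instance>/ as the example paths suggest. The helper name is hypothetical:

```python
from pathlib import Path


def data_plane_credentials(tenant: str, application: str, instance: str = "default"):
    """Return (cert_path, key_path) for an application's data-plane credentials."""
    # Assumed layout, inferred from the example paths: ~/.vespa/<tenant>.<application>.<instance>/
    app_dir = Path.home() / ".vespa" / f"{tenant}.{application}.{instance}"
    return (
        str(app_dir / "data-plane-public-cert.pem"),
        str(app_dir / "data-plane-private-key.pem"),
    )


cert_path, key_path = data_plane_credentials("mytenant", "textsearch")
print(cert_path)
print(key_path)
```

Check that the printed locations match where vespa auth cert actually wrote the files before relying on them.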
Authenticate to generate an API key for deployment, and save path for it:
[6]:
!vespa auth api-key
from pathlib import Path
api_key_path = str(Path.home()) + "/.vespa/" + os.getenv("TENANT_NAME") + ".api-key.pem"
Create an application package
The application package has all the Vespa configuration files - create one from scratch:
[7]:
from vespa.package import ApplicationPackage
app_name = "textsearch"
app_package = ApplicationPackage(name=app_name)
Note that the name cannot have - or _.
The above will create an empty schema with the same name as the application package.
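The naming restriction can be checked before creating the package. The sketch below tests only for the two characters mentioned in the note; pyvespa may enforce further rules that are not covered here, and the helper name is hypothetical:

```python
def has_forbidden_chars(name: str) -> bool:
    """Return True if the name contains '-' or '_'."""
    return any(ch in "-_" for ch in name)


for candidate in ["textsearch", "text-search", "text_search"]:
    status = "rejected" if has_forbidden_chars(candidate) else "ok"
    print(candidate, status)
```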
Add fields to the schema
[8]:
from vespa.package import Field
app_package.schema.add_fields(
    Field(name = "id", type = "string", indexing = ["attribute", "summary"]),
    Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25"),
    Field(name = "body", type = "string", indexing = ["index", "summary"], index = "enable-bm25")
)
- id holds the document ids, while title and body are the text fields of the documents.
- Setting "index" in indexing means that a searchable index for title and body is created. Read more about indexing options.
- Setting index = "enable-bm25" will pre-compute quantities to make it fast to compute the BM25 score.
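To make the BM25 bullet more concrete, here is a minimal sketch of the textbook Okapi BM25 score for a single query term in a single field. It is not Vespa's implementation; the function and argument names are made up for illustration, and k1 = 1.2 and b = 0.75 are assumed as commonly used defaults. The average field length and the number of documents containing a term are the kind of statistics that are cheaper to have ready before query time:

```python
import math


def bm25_term_score(tf, field_len, avg_field_len, n_docs, n_docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term for one document field."""
    # Inverse document frequency: rare terms count for more.
    idf = math.log(1 + (n_docs - n_docs_with_term + 0.5) / (n_docs_with_term + 0.5))
    # Term-frequency saturation, normalized by field length relative to the average.
    norm = tf + k1 * (1 - b + b * field_len / avg_field_len)
    return idf * tf * (k1 + 1) / norm


# A term occurring twice scores higher than once, but less than twice as high.
once = bm25_term_score(1, 100, 100, 1000, 10)
twice = bm25_term_score(2, 100, 100, 1000, 10)
print(once, twice)
```

A rank expression such as bm25(title) + bm25(body) would then correspond to summing such per-field scores over the query terms.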
Search multiple fields
A FieldSet groups fields together for searching - it configures queries to look for matches both in the title and body fields of the documents:
[9]:
from vespa.package import FieldSet
app_package.schema.add_field_set(
    FieldSet(name = "default", fields = ["title", "body"])
)
Define ranking
Specify how to rank the matched documents by defining a RankProfile. Below are different rank profiles that can be selected in the query:
[10]:
from vespa.package import RankProfile
app_package.schema.add_rank_profile(
    RankProfile(name = "bm25", first_phase = "bm25(title) + bm25(body)")
)
app_package.schema.add_rank_profile(
    RankProfile(name = "native_rank", first_phase = "nativeRank(title, body)")
)
Deploy
The text search app with fields, a fieldset to group fields together, and rank profiles is now defined and ready to deploy. Deploy app_package to Vespa Cloud, by creating an instance of VespaCloud:
[11]:
from vespa.deployment import VespaCloud
vespa_cloud = VespaCloud(
    tenant=os.getenv("TENANT_NAME"),
    application=app_name,
    key_location=api_key_path,
    application_package=app_package)
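VespaCloud reads the API key from key_location when it is constructed, so a missing key file fails at that point with a FileNotFoundError. A small pre-flight check can give a clearer message. This is a sketch with a hypothetical helper name, and the path in the usage lines only mirrors the api_key_path pattern used earlier:

```python
from pathlib import Path


def missing_files(*paths: str) -> list:
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


# Hypothetical usage before constructing VespaCloud:
example_key = str(Path.home() / ".vespa" / "mytenant.api-key.pem")
absent = missing_files(example_key)
if absent:
    print("Missing credential files, run 'vespa auth api-key' first:", absent)
```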
[12]:
app = vespa_cloud.deploy(instance="default")
app now holds a reference to a Vespa instance. Store the endpoint for later usage - set endpoint from this output:
[13]:
!vespa status container
[14]:
endpoint = "https://textsearch-container.textsearch.mytenant.aws-us-east-1c.dev.z.vespa-app.cloud/"
Feed
Download approx 10K documents:
[15]:
from pandas import read_csv
docs = read_csv(filepath_or_buffer="https://data.vespa.oath.cloud/blog/msmarco/sample_docs.csv").fillna('')
docs.head()
[15]:
| | id | title | body |
|---|---|---|---|
| 0 | D1712962 | Can you eat crab or imitation krab when you ha... | Answers com Wiki Answers Categories Health... |
| 1 | D1817294 | How long is a tax refund check good | Answers com Wiki Answers Categories Busine... |
| 2 | D1761039 | The Suffolk Resolves 1774 | The Suffolk Resolves 1774 Across New England ... |
| 3 | D2899268 | The eagle has flown | Download citation Share Download full text PDF... |
| 4 | D3278481 | 22b Cotton and African American Life | 22b Cotton and African American Life Two thi... |
Feed the documents to the application:
[16]:
feed_res = app.feed_df(docs, asynchronous=True, batch_size=100)
Query
Query the text search app using the Vespa Query Language by sending the parameters to the body argument of Vespa.query - here using the bm25 rank profile:
[17]:
query = {
    'yql': 'select * from sources * where userQuery()',
    'query': 'what keeps planes in the air',
    'ranking': 'bm25',
    'type': 'all',
    'hits': 10
}
res = app.query(body=query)
res.hits[0]
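Two rank profiles were defined earlier, bm25 and native_rank, and the ranking parameter selects between them at query time. A minimal sketch of a helper that builds the same query body with a selectable profile follows; the helper name is hypothetical, and the commented-out call needs the deployed app from above:

```python
def build_query(text: str, ranking: str = "bm25", hits: int = 10) -> dict:
    """Build a query body like the one above, with a selectable rank profile."""
    return {
        "yql": "select * from sources * where userQuery()",
        "query": text,
        "ranking": ranking,  # name of a rank profile defined in the schema
        "type": "all",
        "hits": hits,
    }


native_query = build_query("what keeps planes in the air", ranking="native_rank")
print(native_query)
# res = app.query(body=native_query)  # requires the deployed app from above
```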
Next steps
This is just an introduction to the capabilities of Vespa and pyvespa. Browse the site to learn more about schemas, feeding and queries - find more complex applications in examples.
Example: Document operations using cert/key pair
Above, we deployed to Vespa Cloud, and as part of that, generated a cert/key pair. This pair can be used to access the dataplane for reads/writes to documents and running queries.
Find the ID of the first document in the feed:
[18]:
from vespa.application import df_to_vespafeed
import json
feed = json.loads(df_to_vespafeed(docs, app_name, "id", namespace=app_name))
doc_json = feed[0]
docid = doc_json["fields"]["id"]
doc_json
[18]:
{'id': 'id:textsearch:textsearch::D1712962',
'fields': {'id': 'D1712962',
'title': 'Can you eat crab or imitation krab when you have gout ',
'body': 'Answers com Wiki Answers Categories Health Conditions and Diseases Arthritis Gout Can you eat crab or imitation krab when you have gout Flag Can you eat crab or imitation krab when you have gout Answer by Robbb Confidence votes 100KBoth crab and imitation crab compressed krill would be bad for a sufferer of gout as they are high in Purines Below is a short guide on dietary needs for gout sufferers A low Purine Diet Will help those suffering from gout avoid atacks Here is a list of some of the foods that are high in purines These foods have the highest concentration of purines and should therefore be avoided by people that are prone to or are already suffering from gout Beef Pork Bacon Lamb Seafood Foods that are made with yeast like beer or bread Alcoholic beverages These foods while high in purine levels that can aggravate the symptoms of gout can be had occasionally by those who usually follow a low purine diet Asparagus Cauliflower Mushrooms Peas Spinach Whole grain breads and cereals White poultry meats like chicken duck or turkey Kidney and lima beans The following foods are safe to eat in a low purine diet however these should not be the only food you eat Green vegetables and tomatoes Fruits and fruit juices Breads that do not use yeast may be found in the kosher section or a health food store Nuts Milk and milk products such as butter and cheese Chocolate Coffee and tea 6 people found this useful Was this answer useful Yes Somewhat No Hcomeadow 14 Contributions Is imitation crab soaked in crab oil It is different for each brand but most brands are mostly fish pieces with lots of added artificial falvours colours preservatives and sugars Some brands are like 40 soya Giddy Up Go 2 Contributions Can you eat imitation crab meat during pregnancy While it is true that imitation crab meat does not contain high levels of mercury that doesn t mean it is unconditionally safe for pregnant women to eat Imitation crab meat Smartchic36 9 Contributions What do 
krabs eat Blue crabs are scavengers They eat edible wastes in the water Crabs can also be predators They eat shellfish beds and seaweed They will also eat snails mussels and decay Is it safe to eat imitation crab meat when you are pregnant nno way Suzzi Q 238 075 Contributions Retired with years of experience with Food FDA and internet surfing If a package of imitation crab is opened is it safe to eat up until the expiration date That will depend how it has been handled and stored The expiration date is usually intended for unopened packages Can you eat imitation crab during pregnancy yes you can just be careful with it if its already cured or such then it should be fine even normal crab is alright to eat as long as it has been cooked properly like any ot Can you eat imitation crab meat if you re allergic to real crab meat yes you can eat I m imitation crab meat if you are allergic to real Squirrel Man 234 785 Contributions If Mr Krabs is a crab then how is Pearl a whale Really there hasn t ever been an explanation for this in the show mainly due to the fact that Sponge Bob Square Pants is a kids show Though we can guess that Mr Krabs ad What does krab eats crackers with ketchup he calls it pizza though Box of Chocolates 636 Contributions How many calories are in imitation crab There are about 81 calories in a 3oz serving of imitation crabmeat Heliotrope 65 192 Contributions Does imitation crab meat have dye Yes red dye Count kryzal 96 Contributions Can a toddler eat imitation crab Yes But it is not crab It is usually Surami with additives Best to feed your child on natural organic food if you can Beansterman1998 20 Contributions Answered In Vegetarianism Can vegetarians eat imitation crab no it s a type of meat fish Answered In Cat Health Can cats eat imitation crab meat Cats can eat imitation crab meat In fact most cats like crab andother types of seafood especially varieties of fish Colin Opperman 15 Contributions Answered In Food Safety Is safe to eat imitation 
crab meat after it s been frozen 20 months Be sure to thaw and then fully cook the meat This should rid the meat of unwanted bacteria Answered In Crabs How do you eat imitation crab meat i don t Answered In Crabs Can you eat imitation crab meat raw Since its imitation crab and pre cooked it is ready to eat out of the package'}}
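The output above shows the shape of one feed operation: a document id of the form id:<namespace>:<document type>::<user id>, plus the fields. As an illustration of that shape only, and not of how df_to_vespafeed is implemented, a row can be converted by hand like this (the helper name is hypothetical):

```python
def to_feed_operation(row: dict, namespace: str, doc_type: str, id_field: str = "id") -> dict:
    """Wrap a row in the id/fields structure shown in the output above."""
    doc_id = f"id:{namespace}:{doc_type}::{row[id_field]}"
    return {"id": doc_id, "fields": dict(row)}


row = {"id": "D1712962", "title": "Can you eat crab or imitation krab when you have gout ", "body": "..."}
operation = to_feed_operation(row, namespace="textsearch", doc_type="textsearch")
print(operation["id"])
```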
Set up a dataplane connection using the cert/key pair:
[19]:
import requests
session = requests.Session()
session.cert = (cert_path, key_path)
Get a document from the endpoint returned when we deployed to Vespa Cloud above:
[20]:
# endpoint ends with "/", so strip it to avoid a double slash in the URL
url = "{0}/document/v1/{1}/{2}/docid/{3}".format(endpoint.rstrip("/"), app_name, app_name, docid)
doc = session.get(url).json()
doc
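The URL in the cell above is assembled with str.format, which works for simple ids like D1712962. For ids that may contain characters with a special meaning in URLs, a sketch that percent-encodes each path segment could look as follows. The helper name is hypothetical and the example host is a placeholder:

```python
from urllib.parse import quote


def document_v1_url(endpoint: str, namespace: str, doc_type: str, doc_id: str) -> str:
    """Build a /document/v1 URL, percent-encoding each path segment."""
    base = endpoint.rstrip("/")  # tolerate endpoints with a trailing slash
    segments = [quote(part, safe="") for part in (namespace, doc_type, doc_id)]
    return f"{base}/document/v1/{segments[0]}/{segments[1]}/docid/{segments[2]}"


# Placeholder endpoint, not a real application URL.
print(document_v1_url("https://example.com/", "textsearch", "textsearch", "D1712962"))
```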
Update the title and post the new version:
[21]:
doc["fields"]["title"] = "Can you eat lobster?"
response = session.post(url, json=doc).json()
response
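The cell above posts the whole document back. When only one field changes, the /document/v1 API also supports partial updates, sent with PUT and an assign operation per field. The sketch below only builds the request body; the helper name is hypothetical, and the commented-out call assumes the session and url from the cells above:

```python
def assign_update(**fields) -> dict:
    """Build a partial-update body that assigns new values to the given fields."""
    return {"fields": {name: {"assign": value} for name, value in fields.items()}}


payload = assign_update(title="Can you eat lobster?")
print(payload)
# response = session.put(url, json=payload).json()  # requires session and url from above
```

This leaves the body field untouched, at the cost of a different request shape than the full-document post.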
Get the doc again to see the updated title:
[22]:
doc = session.get(url).json()
doc
Example: Reconnect pyvespa using cert/key pair
Above, we stored the dataplane credentials for later use. Deployment of an application usually happens when the schema changes, whereas accessing the dataplane is for document updates and user queries.
One only needs to know the endpoint and the cert/key pair to enable a connection to a Vespa Cloud application:
[23]:
# cert_path = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-public-cert.pem"
# key_path = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-private-key.pem"
from vespa.application import Vespa
the_app = Vespa(endpoint, cert=cert_path, key=key_path)
res = the_app.query(body={
    'yql': 'select * from sources * where true',
    'hits': 1
})
res.hits[0]
A common problem is a cert mismatch - the cert/key pair used at deployment differs from the pair used when making queries. Make sure both use the same pair, or re-create it with vespa auth cert -f and redeploy as needed.
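Part of this can be checked locally. The sketch below uses Python's ssl module to test whether a certificate file and a private key file exist and belong together. It cannot tell whether the certificate is the one that was deployed with the application, so a passing check does not rule out a mismatch on the Vespa Cloud side. The function name and status strings are hypothetical:

```python
import os
import ssl


def check_local_pair(cert_path: str, key_path: str) -> str:
    """Return 'missing', 'invalid' or 'ok' for a local cert/key pair."""
    if not (os.path.isfile(cert_path) and os.path.isfile(key_path)):
        return "missing"
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # Raises ssl.SSLError if the key does not match the certificate
        # or if either file cannot be parsed as PEM.
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except ssl.SSLError:
        return "invalid"
    return "ok"


print(check_local_pair("/no/such/cert.pem", "/no/such/key.pem"))
```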