Text Search on Vespa Cloud - quickstart

This guide follows the same steps as getting-started-pyvespa, but deploys to Vespa Cloud.

If you run into problems while running this guide, refer to troubleshooting.

Prerequisite: Create a tenant at cloud.vespa.ai and save the tenant name.

Install

Install pyvespa >= 0.35 and the Vespa CLI. The Vespa CLI is used for key management:

[1]:
!pip3 install pyvespa

Install the Vespa CLI using homebrew:

[2]:
!brew install vespa-cli

Alternatively, if running in Colab, download the Vespa CLI:

[3]:
import os
import requests
res = requests.get(url="https://api.github.com/repos/vespa-engine/vespa/releases/latest").json()
os.environ["VERSION"] = res["tag_name"].lstrip("v")  # drop only the leading "v" from the release tag
!curl -fsSL https://github.com/vespa-engine/vespa/releases/download/v${VERSION}/vespa-cli_${VERSION}_linux_amd64.tar.gz | tar -zxf -
!ln -sf /content/vespa-cli_${VERSION}_linux_amd64/bin/vespa /usr/local/bin/vespa
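The download URL in the cell above is assembled from the release tag. A minimal sketch of that assembly follows, assuming the archive naming pattern shown in the curl command; the helper name `cli_download_url` and the tag `v8.1.2` are hypothetical and only illustrate the string handling.

```python
def cli_download_url(tag: str, platform: str = "linux_amd64") -> str:
    """Build the release archive URL for a GitHub tag such as "v8.1.2" (hypothetical tag)."""
    # Remove only a leading "v"; a blanket replace("v", "") would also hit any other "v" in the tag.
    version = tag[1:] if tag.startswith("v") else tag
    return (
        "https://github.com/vespa-engine/vespa/releases/download/"
        f"v{version}/vespa-cli_{version}_{platform}.tar.gz"
    )


url = cli_download_url("v8.1.2")
```

The `platform` default matches the Colab case in this guide; other platforms would need a different suffix.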

Configure application and keys

Create a Vespa Cloud data-plane cert/key pair. Save the paths to the credentials for later data-plane access without the pyvespa APIs; see the example at the end of this notebook.

[4]:
import os
os.environ["TENANT_NAME"] = "mytenant" # Your tenant name here

!vespa config set target cloud
!vespa config set application ${TENANT_NAME}.textsearch
!vespa auth cert -N
[5]:
cert_path = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-public-cert.pem"
key_path  = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-private-key.pem"
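Rather than hard-coding the paths, they can be derived from the tenant, application, and instance names. This is a sketch that assumes the directory layout shown above (`~/.vespa/<tenant>.<application>.<instance>/`); the helper `data_plane_credentials` is a hypothetical name, not part of pyvespa or the Vespa CLI.

```python
from pathlib import Path


def data_plane_credentials(tenant, application, instance="default", home=None):
    """Return (cert_path, key_path) under the assumed ~/.vespa layout."""
    base = Path(home) if home is not None else Path.home()
    app_dir = base / ".vespa" / f"{tenant}.{application}.{instance}"
    return (
        str(app_dir / "data-plane-public-cert.pem"),
        str(app_dir / "data-plane-private-key.pem"),
    )


cert_path, key_path = data_plane_credentials("mytenant", "textsearch")
```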

Note that the deploy call below adds data-plane-public-cert.pem to the application package before deploying it.

Authenticate to generate an API key for deployment, and save its path:

[6]:
!vespa auth api-key

from pathlib import Path
api_key_path = str(Path.home()) + "/.vespa/" + os.getenv("TENANT_NAME") + ".api-key.pem"
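If the key file is missing, a later step fails when it tries to read it, so it can help to check the path up front. The sketch below assumes the key location built in the cell above; `api_key_file` and `require_file` are hypothetical helpers introduced here for illustration.

```python
from pathlib import Path


def api_key_file(tenant, home=None):
    """Path where this guide expects the API key for the given tenant."""
    base = Path(home) if home is not None else Path.home()
    return base / ".vespa" / f"{tenant}.api-key.pem"


def require_file(path):
    """Return the path as a string, or fail early with a hint if the file is missing."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected a key file at {path}; run 'vespa auth api-key' first"
        )
    return str(path)
```

Calling `require_file(api_key_file("mytenant"))` before deploying would surface a missing key with a more direct message.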

Create an application package

The application package holds all the Vespa configuration files. Create one from scratch:

[7]:
from vespa.package import ApplicationPackage

app_name = "textsearch"
app_package = ApplicationPackage(name=app_name)

Note that the name cannot contain - or _.

The above will create an empty schema with the same name as the application package.
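The naming restriction can be checked before building the package. This sketch encodes only the rule stated above (no `-` or `_`); Vespa may enforce further constraints that are not covered here, and `check_app_name` is a hypothetical helper.

```python
def check_app_name(name: str) -> str:
    """Raise ValueError if the name contains '-' or '_'; otherwise return it unchanged."""
    forbidden = [ch for ch in "-_" if ch in name]
    if forbidden:
        raise ValueError(
            f"Application name {name!r} contains forbidden character(s): {forbidden}"
        )
    return name


app_name = check_app_name("textsearch")
```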

Add fields to the schema

Add the id, title, and body fields:

[8]:
from vespa.package import Field

app_package.schema.add_fields(
    Field(name = "id",    type = "string", indexing = ["attribute", "summary"]),
    Field(name = "title", type = "string", indexing = ["index", "summary"], index = "enable-bm25"),
    Field(name = "body",  type = "string", indexing = ["index", "summary"], index = "enable-bm25")
)
  • id holds the document ids, while title and body are the text fields of the documents.
  • Setting "index" in indexing means that a searchable index for title and body is created. Read more about indexing options.
  • Setting index = "enable-bm25" will pre-compute quantities to make it fast to compute the BM25 score.
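To give a feel for what a BM25 score depends on, here is the textbook per-term BM25 formula in plain Python. It is a sketch only: the parameter values `k1=1.2` and `b=0.75` are assumed common defaults, and Vespa's own implementation may differ in details.

```python
import math


def bm25_term(tf, doc_len, avg_doc_len, n_docs, docs_with_term, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one query term for one document field."""
    # Inverse document frequency: rarer terms get a larger weight.
    idf = math.log(1 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Term-frequency saturation, normalized by the field length relative to the average.
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm


# A term occurring twice in a 100-token field, in a corpus of 10,000 documents
# where 50 documents contain the term (all numbers are made up for illustration).
score = bm25_term(tf=2, doc_len=100, avg_doc_len=120, n_docs=10_000, docs_with_term=50)
```

The inputs (field length, average length, document counts) are the kind of statistics that are cheap to use at query time when they are available in advance.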

Search multiple fields

A FieldSet groups fields together for searching - it configures queries to look for matches both in the title and body fields of the documents:

[9]:
from vespa.package import FieldSet

app_package.schema.add_field_set(
    FieldSet(name = "default", fields = ["title", "body"])
)

Define ranking

Specify how to rank the matched documents by defining a RankProfile. Below are different rank profiles that can be selected in the query:

[10]:
from vespa.package import RankProfile

app_package.schema.add_rank_profile(
    RankProfile(name = "bm25", first_phase = "bm25(title) + bm25(body)")
)
app_package.schema.add_rank_profile(
    RankProfile(name = "native_rank", first_phase = "nativeRank(title, body)")
)
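The rank profile is selected per query through the `ranking` parameter. The sketch below builds one request body per profile, reusing the parameters of the query shown later in this guide; `build_query` is a hypothetical helper, not a pyvespa function.

```python
def build_query(text, ranking="bm25", hits=10):
    """Build a query body that selects the given rank profile."""
    return {
        "yql": "select * from sources * where userQuery()",
        "query": text,
        "ranking": ranking,  # name of a rank profile defined in the schema
        "type": "all",
        "hits": hits,
    }


# One body per rank profile defined above.
bodies = {
    profile: build_query("what keeps planes in the air", ranking=profile)
    for profile in ("bm25", "native_rank")
}
```

Each body can then be passed to `app.query(body=...)` once the application is deployed.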

Deploy

The text search app, with its fields, a fieldset to group them, and rank profiles, is now defined and ready to deploy. Deploy app_package to Vespa Cloud by creating an instance of VespaCloud:

[11]:
from vespa.deployment import VespaCloud

vespa_cloud = VespaCloud(
    tenant=os.getenv("TENANT_NAME"),
    application=app_name,
    key_location=api_key_path,
    application_package=app_package)
[12]:
app = vespa_cloud.deploy(instance="default")

app now holds a reference to a Vespa instance. Store the endpoint for later use by setting endpoint from the output of this command:

[13]:
!vespa status container
[14]:
endpoint = "https://textsearch-container.textsearch.mytenant.aws-us-east-1c.dev.z.vespa-app.cloud/"

Feed

Download approximately 10,000 documents:

[15]:
from pandas import read_csv

docs = read_csv(filepath_or_buffer="https://data.vespa.oath.cloud/blog/msmarco/sample_docs.csv").fillna('')
docs.head()
[15]:
   id        title                                              body
0  D1712962  Can you eat crab or imitation krab when you ha...  Answers com Wiki Answers Categories Health...
1  D1817294  How long is a tax refund check good                Answers com Wiki Answers Categories Busine...
2  D1761039  The Suffolk Resolves 1774                          The Suffolk Resolves 1774 Across New England ...
3  D2899268  The eagle has flown                                Download citation Share Download full text PDF...
4  D3278481  22b Cotton and African American Life               22b Cotton and African American Life Two thi...

Feed the documents to the application:

[16]:
feed_res = app.feed_df(docs, asynchronous=True, batch_size=100)
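The `batch_size=100` argument means documents are sent in groups of 100. The following sketch only illustrates that kind of chunking; it is not pyvespa's internal implementation, and `batches` is a hypothetical helper.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 250 placeholder documents split into groups of 100.
sizes = [len(chunk) for chunk in batches(list(range(250)), 100)]
```

Here `sizes` comes out as `[100, 100, 50]`: two full batches and a final partial one.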

Query

Query the text search app using the Vespa Query Language (YQL) by passing the parameters to the body argument of Vespa.query. This example uses the bm25 rank profile:

[17]:
query = {
    'yql': 'select * from sources * where userQuery()',
    'query': 'what keeps planes in the air',
    'ranking': 'bm25',
    'type': 'all',
    'hits': 10
}
res = app.query(body=query)
res.hits[0]
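With `'type': 'all'`, every query term has to match, and with the default fieldset a term may match in either title or body. The toy model below illustrates that matching rule on plain dictionaries. It is a deliberate simplification that assumes whitespace tokenization and ignores Vespa's linguistic processing; `matches_all` is a hypothetical helper.

```python
def matches_all(doc, query, fields=("title", "body")):
    """True if every query term occurs in at least one of the given fields."""
    tokens = set()
    for field in fields:
        tokens.update(str(doc.get(field, "")).lower().split())
    return all(term in tokens for term in query.lower().split())


doc = {"title": "What keeps planes in the air", "body": "Lift from the wings"}
hit = matches_all(doc, "planes wings")    # terms found across title and body
miss = matches_all(doc, "planes rockets")  # "rockets" occurs in neither field
```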

Next steps

This is just an introduction to the capabilities of Vespa and pyvespa. Browse the site to learn more about schemas, feeding, and queries, and find more complex applications in examples.

Example: Document operations using cert/key pair

Above, we deployed to Vespa Cloud, and as part of that, generated a cert/key pair. This pair can be used to access the dataplane for reads/writes to documents and running queries.

Find the ID of the first document in the feed:

[18]:
from vespa.application import df_to_vespafeed
import json

feed = json.loads(df_to_vespafeed(docs, app_name, "id", namespace=app_name))
doc_json = feed[0]
docid = doc_json["fields"]["id"]
doc_json
[18]:
{'id': 'id:textsearch:textsearch::D1712962',
 'fields': {'id': 'D1712962',
  'title': 'Can you eat crab or imitation krab when you have gout ',
  'body': 'Answers com   Wiki Answers   Categories Health Conditions and Diseases Arthritis Gout Can you eat crab or imitation krab when you have gout  Flag Can you eat crab or imitation krab when you have gout  Answer by Robbb Confidence votes 100KBoth crab and imitation crab  compressed krill would be bad for a sufferer of gout as they are high in Purines  Below is a short guide on dietary needs for gout sufferers  A low Purine Diet Will help those suffering from gout avoid atacks  Here is a list of some of the foods that are high in purines  These foods have the highest concentration of purines and should therefore be avoided by people that are prone to or are already suffering from gout  Beef  Pork  Bacon  Lamb  Seafood  Foods that are made with yeast like beer or bread  Alcoholic beverages These foods  while high in purine levels that can aggravate the symptoms of gout can be had occasionally by those who usually follow a low purine diet  Asparagus  Cauliflower  Mushrooms  Peas  Spinach  Whole grain breads and cereals  White poultry meats  like chicken  duck or turkey  Kidney and lima beans  The following foods are safe to eat in a low purine diet  however these should not be the only food you eat  Green vegetables and tomatoes  Fruits and fruit juices  Breads that do not use yeast  may be found in the kosher section or a health food store   Nuts  Milk and milk products such as butter and cheese  Chocolate  Coffee and tea 6 people found this useful Was this answer useful  Yes Somewhat No Hcomeadow 14 Contributions Is imitation crab soaked in crab oil  It is different for each brand but most brands are mostly fish pieces with lots of added artificial falvours  colours  preservatives and sugars  Some brands are like 40  soya  Giddy Up Go 2 Contributions Can you eat imitation crab meat during pregnancy  While it is true that imitation crab meat does not contain high levels of mercury  that doesn t mean it is unconditionally safe for pregnant women to eat  
Imitation crab meat  Smartchic36 9 Contributions What do krabs eat  Blue crabs are scavengers  They eat edible wastes in the water  Crabs can also be predators  They eat shellfish beds and seaweed  They will also eat snails  mussels  and decay  Is it safe to eat imitation crab meat when you are pregnant  nno way Suzzi Q 238 075 Contributions Retired with years of experience with Food  FDA and internet surfing  If a package of imitation crab is opened is it safe to eat up until the expiration date  That will depend how it has been handled and stored  The expiration date is usually intended for unopened packages  Can you eat imitation crab during pregnancy yes you can just be careful with it  if its already cured or such then it should be fine  even normal crab is alright to eat as long as it has been cooked properly like any ot  Can you eat imitation crab meat if you re allergic to real crab meat yes you can eat I m imitation crab meat if you are allergic to real Squirrel Man 234 785 Contributions If Mr  Krabs is a crab then how is Pearl a whale  Really  there hasn t ever been an explanation for this in the show  mainly due to the fact that Sponge Bob Square Pants is a kids  show   Though  we can guess that Mr  Krabs ad  What does krab eats crackers with ketchup  he calls it pizza though  Box of Chocolates 636 Contributions How many calories are in imitation crab  There are about 81 calories in a 3oz serving of imitation crabmeat  Heliotrope 65 192 Contributions Does imitation crab meat have dye  Yes  red dye  Count kryzal 96 Contributions Can a toddler eat imitation crab  Yes  But it is not crab  It is usually Surami with additives  Best to feed your child on natural  organic food if you can  Beansterman1998 20 Contributions Answered In Vegetarianism Can vegetarians eat imitation crab no  it s a type of meat fish Answered In Cat Health Can cats eat imitation crab meat  Cats can eat imitation crab meat  In fact  most cats like crab andother types of seafood  
especially varieties of fish  Colin Opperman 15 Contributions Answered In Food Safety Is safe to eat imitation crab meat after it s been frozen 20 months  Be sure to thaw  and then fully cook the meat  This should rid the meat of unwanted bacteria  Answered In Crabs How do you eat imitation crab meat i don t Answered In Crabs Can you eat imitation crab meat raw  Since its imitation crab and pre cooked it is ready to eat out of the package'}}
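The output above shows the shape of a feed operation: a document id of the form `id:<namespace>:<schema>::<id>` plus the document fields. A minimal sketch of building that structure from a plain record follows; `to_feed_operation` is a hypothetical helper that mirrors the output shown, not the `df_to_vespafeed` implementation.

```python
def to_feed_operation(record, schema, id_field, namespace):
    """Wrap a record in the id/fields structure shown in the output above."""
    return {
        "id": f"id:{namespace}:{schema}::{record[id_field]}",
        "fields": dict(record),
    }


op = to_feed_operation(
    {"id": "D1712962", "title": "example title", "body": "example body"},
    schema="textsearch",
    id_field="id",
    namespace="textsearch",
)
```

The title and body values here are placeholders; only the id is taken from the guide's data.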

Set up a dataplane connection using the cert/key pair:

[19]:
import requests

session = requests.Session()
session.cert = (cert_path, key_path)

Get a document from the endpoint returned when we deployed to Vespa Cloud above:

[20]:
url = "{0}/document/v1/{1}/{2}/docid/{3}".format(endpoint.rstrip("/"), app_name, app_name, docid)  # avoid "//" when endpoint ends with "/"
doc = session.get(url).json()
doc
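The document URL follows the pattern `<endpoint>/document/v1/<namespace>/<document type>/docid/<id>`. As a sketch, the helper below builds it while tolerating a trailing slash on the endpoint and percent-encoding each path segment; `document_url` is a hypothetical name and `example.invalid` is a placeholder host.

```python
from urllib.parse import quote


def document_url(endpoint, namespace, doctype, docid):
    """Build a /document/v1 URL for a single document."""
    return "{0}/document/v1/{1}/{2}/docid/{3}".format(
        endpoint.rstrip("/"),       # tolerate endpoints that end with "/"
        quote(namespace, safe=""),  # percent-encode each path segment
        quote(doctype, safe=""),
        quote(str(docid), safe=""),
    )


url = document_url("https://example.invalid/", "textsearch", "textsearch", "D1712962")
```

Encoding matters mostly for document ids that contain characters such as `/` or spaces; the ids in this guide's data set do not.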

Update the title and post the new version:

[21]:
doc["fields"]["title"] = "Can you eat lobster?"
response = session.post(url, json=doc).json()
response
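As an alternative to posting the whole document again, Vespa's /document/v1 API also accepts partial updates sent with HTTP PUT, where each field carries an `assign` operation. The sketch below only builds such a payload; `assign_update` is a hypothetical helper.

```python
def assign_update(**fields):
    """Build a partial-update body that assigns new values to the given fields."""
    return {"fields": {name: {"assign": value} for name, value in fields.items()}}


payload = assign_update(title="Can you eat lobster?")
```

With the `session` and `url` from the cells above, this could be sent as `session.put(url, json=payload)`, leaving the other fields of the document untouched.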

Get the doc again to see the updated title:

[22]:
doc = session.get(url).json()
doc

Example: Reconnect pyvespa using cert/key pair

Above, we stored the dataplane credentials for later use. Deployment of an application usually happens when the schema changes, whereas accessing the dataplane is for document updates and user queries.

Only the endpoint and the cert/key pair are needed to connect to a Vespa Cloud application:

[23]:
# cert_path = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-public-cert.pem"
# key_path  = "/Users/me/.vespa/mytenant.textsearch.default/data-plane-private-key.pem"

from vespa.application import Vespa

the_app = Vespa(endpoint, cert=cert_path, key=key_path)

res = the_app.query(body={
    'yql': 'select * from sources * where true',
    'hits': 1
})
res.hits[0]

A common problem is a cert mismatch: the cert/key pair used at deployment differs from the pair used when making queries. Make sure the same pair is used, or re-create it with vespa auth cert -f and redeploy as needed.