Web Search Fails #1653

Open
gururise opened this issue Jan 16, 2025 · 4 comments
Labels
bug Something isn't working

Comments

@gururise
Contributor

gururise commented Jan 16, 2025

Bug description

With Llama 3.3-70b (from TogetherAI), the web search tool enabled, and a SERPER_API_KEY configured, web search fails.

Steps to reproduce

  1. Set SERPER_API_KEY in .env.local
  2. Select any model that supports function calling
  3. Enable Websearch
  4. PROMPT: tell me about the Prestige R2 PRO DTF Printer by DTF Station

Screenshots

Image

Logs

2025/01/16 12:47PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Generation failed","stack":"Error: Generation failed\n    at generateFromDefaultEndpoint (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1048:9)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async getReturnFromGenerator (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1053:14)\n    at async generateTitle (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:216:10)\n    at async generateTitleForConversation (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:180:19)"} msg=Generation failed

Failed to load page within 3.5s: https://dtfstation.com/products/prestige-r2-pro-dtf-printer

Failed to load page within 3.5s: https://dtfstation.com/collections/dtf-station-prestige-r2-pro-dtf-printer

Failed to load page within 3.5s: https://dtfstation.com/collections/dtf-station-prestige-r2-and-prestige-r2-pro

Failed to load page within 3.5s: https://dtfsuperstore.com/products/prestige-r2-pro-dtf-printer?srsltid=AfmBOoqb6k3n_sGdEckt7_OooXJa3HfnSbKD4S-snNk7chydhv4E0het

Failed to load page within 3.5s: https://dtfstation.com/products/prestige-r2-pro-shaker-bundle

Failed to load page within 3.5s: https://www.heatpressnation.com/products/prestige-r2-pro-dtf-printer?srsltid=AfmBOoqwiMXRMf9H2JH8ROzBuXCXkCcH-RIMGB44kiJDeAb-T3zFYuD4

Failed to load page within 3.5s: https://www.swingdesign.com/collections/prestige-r2-13-direct-to-film-dtf-printer?srsltid=AfmBOor0dSxUqz_ACVzpgemTnQ6aCnc6WcZavpDxHyPEhAnvi5Fqy-h4

Failed to load page within 3.5s: https://lawsonsp.com/products/r2-pro-dtf-printer?srsltid=AfmBOorCgs6K_-7yN1I9f_FoYcLZKepyRTId-3QPSD5MUwQva2Sdm0gY

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://dtfstation.com/products/prestige-r2-pro-dtf-printer

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://dtfstation.com/collections/dtf-station-prestige-r2-pro-dtf-printer

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://dtfstation.com/collections/dtf-station-prestige-r2-and-prestige-r2-pro

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://dtfsuperstore.com/products/prestige-r2-pro-dtf-printer?srsltid=AfmBOoqb6k3n_sGdEckt7_OooXJa3HfnSbKD4S-snNk7chydhv4E0het

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://dtfstation.com/products/prestige-r2-pro-shaker-bundle

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://www.heatpressnation.com/products/prestige-r2-pro-dtf-printer?srsltid=AfmBOoqwiMXRMf9H2JH8ROzBuXCXkCcH-RIMGB44kiJDeAb-T3zFYuD4

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://www.swingdesign.com/collections/prestige-r2-13-direct-to-film-dtf-printer?srsltid=AfmBOor0dSxUqz_ACVzpgemTnQ6aCnc6WcZavpDxHyPEhAnvi5Fqy-h4

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 err={"type":"Error","message":"Failed to load page","stack":"Error: Failed to load page\n    at file:///app/build/server/chunks/index3-7gUUM9Xd.js:790:21\n    at withPage (file:///app/build/server/chunks/index3-7gUUM9Xd.js:210:18)\n    at runNextTicks (node:internal/process/task_queues:60:5)\n    at process.processImmediate (node:internal/timers:454:9)\n    at process.callbackTrampoline (node:internal/async_hooks:130:17)\n    at async file:///app/build/server/chunks/index3-7gUUM9Xd.js:776:18\n    at async mergeAsyncGenerators (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1292:34)\n    at async runWebSearch (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1322:29)\n    at async Object.call (file:///app/build/server/chunks/index3-7gUUM9Xd.js:1387:34)\n    at async callTool (file:///app/build/server/chunks/_server.ts-Bfa2n-tw.js:338:24)"} msg=Error scraping webpage: https://lawsonsp.com/products/r2-pro-dtf-printer?srsltid=AfmBOorCgs6K_-7yN1I9f_FoYcLZKepyRTId-3QPSD5MUwQva2Sdm0gY

2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 msg=No text found in the first 8 results
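
For triage, the repeated failures above can be condensed per host. This is a minimal TypeScript sketch, assuming the plain `Failed to load page within Ns: <url>` line format shown in this log. `parseLoadFailures` and `countByHost` are hypothetical helper names, not part of chat-ui.

```typescript
// Hypothetical helpers: summarise "Failed to load page" lines from a log dump.
// The line format is copied from the log excerpt above; nothing here is chat-ui code.

interface LoadFailure {
  url: string;
  host: string;
  timeoutSeconds: number;
}

const FAILED_LOAD = /^Failed to load page within ([\d.]+)s: (\S+)$/;

function parseLoadFailures(log: string): LoadFailure[] {
  const failures: LoadFailure[] = [];
  for (const rawLine of log.split("\n")) {
    const match = FAILED_LOAD.exec(rawLine.trim());
    if (!match) continue; // skip stack traces and unrelated lines
    const url = match[2];
    failures.push({
      url,
      host: new URL(url).hostname,
      timeoutSeconds: Number(match[1]),
    });
  }
  return failures;
}

function countByHost(failures: LoadFailure[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const failure of failures) {
    counts.set(failure.host, (counts.get(failure.host) ?? 0) + 1);
  }
  return counts;
}

// A few lines taken from the log above.
const sample = [
  "Failed to load page within 3.5s: https://dtfstation.com/products/prestige-r2-pro-dtf-printer",
  "Failed to load page within 3.5s: https://dtfstation.com/products/prestige-r2-pro-shaker-bundle",
  "Failed to load page within 3.5s: https://lawsonsp.com/products/r2-pro-dtf-printer",
  "2025/01/16 12:48PM 50 pid=22 hostname=87bc9fc76379 msg=No text found in the first 8 results",
].join("\n");

const byHost = countByHost(parseLoadFailures(sample));
console.log(byHost.get("dtfstation.com")); // 2
console.log(byHost.get("lawsonsp.com")); // 1
```

Every one of the 8 results fails the same way across several unrelated hosts, which is why the final log line reports that no text was found.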

Specs

  • OS: node:20-slim (DOCKER)
  • Browser: Firefox
  • chat-ui commit: 1021e2f

Config

#ENABLE_LOCAL_FETCH=false
RATE_LIMIT=60
#WEBSEARCH_JAVASCRIPT=true
#USE_LOCAL_WEBSEARCH=false
SERPER_API_KEY=<REDACTED>
WEBSEARCH_ALLOWLIST=`[]` # if it's defined, allow websites from only this list.
WEBSEARCH_BLOCKLIST=`["youtube.com", "twitter.com"]` # if it's defined, block websites from this list.

LLM_SUMMARIZATION=true # generate conversation titles with LLMs
ENABLE_ASSISTANTS=true #set to true to enable assistants feature
ENABLE_ASSISTANTS_RAG=true # /!\ This will let users specify arbitrary URLs that the server will then request. Make sure you have the proper firewall r>
REQUIRE_FEATURED_ASSISTANTS=true # require featured assistants to show in the list
COMMUNITY_TOOLS=true # set to true to enable community tools
EXPOSE_API=true # make the /api routes available
ALLOW_IFRAME=true # Allow the app to be embedded in an iframe
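
One thing worth ruling out from this config is that the failing hosts are being filtered by the allowlist or blocklist. The TypeScript sketch below checks that under stated assumptions: the lists are JSON arrays of bare domains (as written above), an empty allowlist means "no restriction", and a domain also matches its subdomains. How chat-ui actually applies these lists may differ; `parseDomainList`, `hostMatches`, and `isUrlPermitted` are hypothetical names.

```typescript
// Assumption: lists are JSON arrays of bare domains, as in the config above.
function parseDomainList(raw: string | undefined): string[] {
  if (!raw) return [];
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of domains");
  }
  return parsed.map((domain) => String(domain).toLowerCase());
}

// A domain matches itself and any of its subdomains.
function hostMatches(host: string, domain: string): boolean {
  return host === domain || host.endsWith("." + domain);
}

// Hypothetical rule: the blocklist always applies; an empty allowlist allows everything.
function isUrlPermitted(url: string, allowlist: string[], blocklist: string[]): boolean {
  const host = new URL(url).hostname.toLowerCase();
  if (blocklist.some((domain) => hostMatches(host, domain))) return false;
  if (allowlist.length === 0) return true;
  return allowlist.some((domain) => hostMatches(host, domain));
}

const allowlist = parseDomainList("[]");
const blocklist = parseDomainList('["youtube.com", "twitter.com"]');

console.log(
  isUrlPermitted("https://dtfstation.com/products/prestige-r2-pro-dtf-printer", allowlist, blocklist),
); // true
console.log(isUrlPermitted("https://www.youtube.com/watch", allowlist, blocklist)); // false
```

Under these assumptions, none of the URLs in the log are excluded by the lists, so filtering is unlikely to explain the failures.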
gururise added the bug label Jan 16, 2025
@nsarrazin
Collaborator

It seems the web pages are not loading. Do you have any network settings that could be blocking the requests?

Could you try something like curl -I https://dtfsuperstore.com from inside the Docker container?

@nsarrazin
Collaborator

Alternatively, if curl -I returns a 2XX code, you could try increasing WEBSEARCH_TIMEOUT (the default is 3500 ms).
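
To see whether a larger timeout would help, it can be useful to measure how long a page load takes relative to the limit. Below is a minimal TypeScript sketch of racing a loader against a timeout. It is not chat-ui's implementation: `probeWithTimeout`, `Loader`, and the fake loaders are hypothetical and only illustrate the technique.

```typescript
// Hypothetical probe, not chat-ui code: race a page loader against a timeout
// and report how long the attempt took.

type Loader = (url: string) => Promise<string>;

interface ProbeResult {
  url: string;
  ok: boolean;
  elapsedMs: number;
  error?: string;
}

async function probeWithTimeout(url: string, load: Loader, timeoutMs: number): Promise<ProbeResult> {
  const started = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Failed to load page within ${timeoutMs / 1000}s`)),
      timeoutMs,
    );
  });
  try {
    // Whichever settles first wins: the loader or the timeout.
    await Promise.race([load(url), timeout]);
    return { url, ok: true, elapsedMs: Date.now() - started };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { url, ok: false, elapsedMs: Date.now() - started, error: message };
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after a fast load
  }
}

// Fake loaders so the sketch runs without network access.
const fastLoader: Loader = async () => "<html>ok</html>";
const slowLoader: Loader = () =>
  new Promise<string>((resolve) => setTimeout(() => resolve("late"), 200));

probeWithTimeout("https://dtfsuperstore.com", fastLoader, 3500).then((r) => console.log(r.ok));
probeWithTimeout("https://dtfsuperstore.com", slowLoader, 50).then((r) => console.log(r.ok, r.error));
```

In a real check, the loader would be whatever actually loads the page (a plain HTTP request or a headless browser). If the measured time stays well under the limit and the load still fails, the timeout is probably not the cause.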

@gururise
Contributor Author

gururise commented Jan 20, 2025

Alternatively if curl -I returns a 2XX code, you could try increasing the WEBSEARCH_TIMEOUT (default is 3500) ms

It looks like I'm getting a 2XX code back (run directly from inside the Docker container). I will try increasing WEBSEARCH_TIMEOUT.

HTTP/2 103 
link: <https://cdn.shopify.com>; rel=preconnect, <https://cdn.shopify.com>; crossorigin; rel=preconnect

HTTP/2 200 
date: Mon, 20 Jan 2025 18:29:57 GMT
content-type: text/html; charset=utf-8
x-sorting-hat-podid: 174
x-sorting-hat-shopid: 50255528111
x-storefront-renderer-rendered: 1
x-shopify-nginx-no-cookies: 1
set-cookie: secure_customer_sig=; path=/; expires=Tue, 20 Jan 2026 18:29:57 GMT; secure; HttpOnly; SameSite=Lax
set-cookie: localization=US; path=/; expires=Tue, 20 Jan 2026 18:29:57 GMT; SameSite=Lax
set-cookie: cart_currency=USD; path=/; expires=Mon, 03 Feb 2025 18:29:57 GMT; SameSite=Lax
set-cookie: _shopify_y=ee96cf9b-7a4d-4e22-a8d1-7c355d85b720; domain=dtfsuperstore.com; path=/; expires=Wed, 21 Jan 2026 00:29:57 GMT; SameSite=Lax
set-cookie: _shopify_s=9c92fc67-356f-4e85-8904-8a8b487b7681; domain=dtfsuperstore.com; path=/; expires=Wed, 22 Jan 2025 00:29:57 GMT; SameSite=Lax
set-cookie: _tracking_consent=%7B%22con%22%3A%7B%22CMP%22%3A%7B%22a%22%3A%22%22%2C%22m%22%3A%22%22%2C%22p%22%3A%22%22%2C%22s%22%3A%22%22%7D%7D%2C%22v%22%3A%222.1%22%2C%22region%22%3A%22USCA%22%2C%22reg%22%3A%22%22%2C%22purposes%22%3A%7B%22a%22%3Atrue%2C%22p%22%3Atrue%2C%22m%22%3Atrue%2C%22t%22%3Atrue%7D%2C%22display_banner%22%3Afalse%2C%22sale_of_data_region%22%3Afalse%2C%22consent_id%22%3A%2220720748-f9b8-4421-867c-af2c01718411%22%7D; domain=dtfsuperstore.com; path=/; expires=Tue, 20 Jan 2026 18:29:57 GMT; SameSite=Lax
set-cookie: _orig_referrer=; domain=dtfsuperstore.com; path=/; expires=Mon, 03 Feb 2025 18:29:57 GMT; HttpOnly; SameSite=Lax
set-cookie: _landing_page=%2F; domain=dtfsuperstore.com; path=/; expires=Mon, 03 Feb 2025 18:29:57 GMT; HttpOnly; SameSite=Lax
link: <https://cdn.shopify.com>; rel="preconnect", <https://cdn.shopify.com>; rel="preconnect"; crossorigin
etag: W/"cacheable:8c6fb1a555e8603c5e24c4304dc55e37"
x-cache: miss
x-frame-options: DENY
content-security-policy: block-all-mixed-content; frame-ancestors 'none'; upgrade-insecure-requests;
strict-transport-security: max-age=7889238
x-shopid: 50255528111
x-shardid: 174
vary: Accept
content-language: en-US
powered-by: Shopify
server-timing: processing;dur=399;desc="gc:35", db;dur=178, db_async;dur=2.411, render;dur=133, asn;desc="63473", edge;desc="LAX", country;desc="US", theme;desc="135336198319", pageType;desc="index", servedBy;desc="bcj8", requestID;desc="f339ccee-340b-41ae-a43f-d9f675170c4e-1737397796"
x-dc: gcp-us-west1,gcp-us-central1,gcp-us-central1
x-request-id: f339ccee-340b-41ae-a43f-d9f675170c4e-1737397796
alt-svc: h3=":443"; ma=86400
cf-cache-status: DYNAMIC
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v4?s=67405xR9o6SBy%2BBJ0ieoOwrgd7g0s5OwPIEBOLJ8HxIDLYO23yX0t7oNwmrz0sF6dTkaTma2EtqtQgmLx6mPpThECfqxXQ8NUJ1mAgTd6Vv5xGVTrCynV1t37C9MTh9nukVn"}],"group":"cf-nel","max_age":604800}
nel: {"success_fraction":0.01,"report_to":"cf-nel","max_age":604800}
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
x-permitted-cross-domain-policies: none
x-download-options: noopen
server: cloudflare
cf-ray: 90512204bdc7522d-LAX
server-timing: cfRequestDuration;dur=506.000042, earlyhints

@gururise
Contributor Author

gururise commented Jan 21, 2025

Even after increasing WEBSEARCH_TIMEOUT to 6000 ms, I still get the same result. Something is odd about that specific search query: it seems to fail when extracting the text from certain webpages. As far as I can tell, this is likely a text extraction bug rather than a timeout.

If I try "What are today's current events", the search does not fail (even though it thinks today is Oct 2023):

Image
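
One way to test the text-extraction hypothesis above is to separate "the page never loaded" from "the page loaded but yielded no text". The TypeScript sketch below uses a naive, regex-based extractor as a stand-in. It is not the extraction logic chat-ui uses, and `extractVisibleText` and `diagnose` are hypothetical names.

```typescript
// Naive, regex-based text extraction. Illustrative stand-in only:
// it does not decode HTML entities or execute JavaScript.
function extractVisibleText(html: string): string {
  return html
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ") // drop non-visible blocks
    .replace(/<[^>]+>/g, " ") // drop remaining tags
    .replace(/\s+/g, " ")
    .trim();
}

type Diagnosis = "load-failed" | "no-text" | "ok";

// html === null models a page that could not be loaded at all.
function diagnose(html: string | null, minChars = 1): Diagnosis {
  if (html === null) return "load-failed";
  return extractVisibleText(html).length >= minChars ? "ok" : "no-text";
}

console.log(diagnose(null)); // "load-failed"
console.log(
  diagnose('<html><body><script>render()</script><div id="app"></div></body></html>'),
); // "no-text"
console.log(diagnose("<html><body><h1>Prestige R2 PRO</h1></body></html>")); // "ok"
```

A page that is rendered entirely client-side would fall into the "no-text" bucket with an extractor like this. The log above reports "Failed to load page" rather than empty text, so both failure modes are worth checking separately.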
