Baylys
Just practicing to pass the Turing test.

manilaSuite.search.indexMessage

Observation shows that this is a very expensive script on large Manila websites. It is called whenever a message is created or edited, provided the search engine is enabled for the website. There are two expensive regions:

1) A loop through every date from 1904 to 2040 to see whether the message is a news item for that day. 136 x 365 is slightly less than 50,000, which is a lot of effort to see if a news item links to a page.

2) A walk through the site structure to determine whether the message number of the message being indexed appears anywhere.
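
To put a number on the first region, here is a minimal sketch in Python (not UserTalk, so that it can be run outside Frontier). It assumes the loop covers 1 January 1904 up to, but not including, 1 January 2040; the exact endpoints used by the script are not shown on this page.

```python
from datetime import date

# Assumed bounds: the text only says "from 1904 to 2040".
START = date(1904, 1, 1)
END = date(2040, 1, 1)

years = END.year - START.year        # number of calendar years covered
rough_estimate = years * 365         # the back-of-envelope figure in the text
exact_days = (END - START).days      # same span, leap days included

print(years, rough_estimate, exact_days)
```

Counting leap days raises the total slightly, but under these assumptions it stays below 50,000 date checks per indexed message.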

I know of no simple fix for the first problem, but the second can be alleviated by using the paths table that Manila maintains, which is a table of URL paths indexed by message number. Using this structure noticeably speeds up user response and reduces the impact of indexing on the server.
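
The difference between the two approaches can be modelled in a few lines of Python. This is only a sketch of the idea, not Manila's code: the nested dictionary stands in for the site structure, its leaves are message numbers, and the names find_by_walking, build_paths_table and the sample site are all hypothetical.

```python
def find_by_walking(node, msg_num, prefix=""):
    """Depth-first search of a nested site structure for msg_num."""
    for name, child in node.items():
        path = prefix + "/" + name
        if isinstance(child, dict):
            found = find_by_walking(child, msg_num, path)
            if found is not None:
                return found
        elif child == msg_num:
            return path
    return None


def build_paths_table(node, prefix="", table=None):
    """Flatten the structure once into a {msg_num: path} table."""
    if table is None:
        table = {}
    for name, child in node.items():
        path = prefix + "/" + name
        if isinstance(child, dict):
            build_paths_table(child, path, table)
        else:
            table[child] = path
    return table


# Hypothetical site structure: leaves are message numbers.
site = {
    "about": {"history": 12, "staff": 31},
    "products": {"widgets": {"blue": 57, "red": 58}},
    "contact": 90,
}

paths = build_paths_table(site)

# Both approaches agree, but the table answers with a single lookup.
assert find_by_walking(site, 58) == paths[58] == "/products/widgets/red"
assert find_by_walking(site, 999) is None and 999 not in paths
```

The walk visits pages until it finds a match, and visits all of them when the message is not in the structure at all, which is the common case for ordinary discussion messages. The table is built once and then answers each query with one lookup.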

on indexMessage (adrMsgTable, adrSite=nil)
  «Changes:
 unaltered lines omitted
 
  local (title, url, text, siteName, siteUrl, createdDate, lastModDate) //page info
  local (flStory = false, flHierarchyPage = false, flCalendarPage = false, flNewsItem = false) //Manila-specific info; 10/29/00 JES: added flNewsItem
  local (msgNum = adrMsgTable^.msgNum) //the message number of this message
  local (adrPrefsTable) //the address of the prefs table in the dg storage
  local (adrConfig) //the address of the search config table in the #newsSite table for this site
  local (domain, port, methodName, rpcPath) //info about the search engine indexing server
  local (flIndex = false) //flag for indexing or not
 
  sys.systemTask ()
  on logNoIndex (message) //log the fact that this page can't be indexed
 unaltered lines omitted
 
  bundle //get the address of this Manila site
 
  bundle //get basic info about the page
 
  bundle //get the URL of this page
  «It's either a dg message, story, #hierarchy page, news item, or in the calendar structure (a present or former home page).
 
  «Assume it's a message.
  if not defined (adrSite^.["#urls"].discussMsgReader)
 unaltered lines omitted
  url = adrSite^.["#urls"].discussMsgReader
  if not (url endsWith "$")
 unaltered lines omitted
  url = url + msgNum
 
  «Find out if it's a story.
 unaltered lines omitted
  if defined (adrMsgTable^.alsoListedIn)
 unaltered lines omitted
 
  «Find out if it's a hierarchy page.
  local (adrHierarchy = @adrSite^.["#hierarchy"])
  «if defined (adrHierarchy^)
 unaltered lines omitted
  if defined (adrHierarchy^) //JES 8/13/04: implement optimization, factoring from 02/06/17, 14:55:23 by DAB
    manilaSuite.siteStructure.compileIfDirty (adrSite)
    if defined (adrHierarchy^.paths.[msgNum])
      url = adrSite^.["#ftpSite"].url + string.popLeading (adrHierarchy^.paths.[msgNum], "/")
      flHierarchyPage = true
 
  «Find out if it's a posted news item, get the URL, and fix text and lastModDate
  bundle
 
  «Find out if it's in the calendar structure
  if not flNewsItem //10/29/00 JES: we already know the URL of newsitem-based homepages
 unaltered lines omitted
 
  bundle //decide whether or not index this page
 
  bundle //get info about the search engine indexing server
 
  bundle //send the page to the search engine indexing server
 
  bundle //log the fact that the page was sent to the indexing server
 
  return (true)

Relative to Frontier version 9.7b10
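
As a footnote on the URL line in the hierarchy branch, the following Python sketch shows why the leading slash is popped from the stored path before it is appended. It assumes that the site URL ends with a slash and that stored paths begin with one; neither is confirmed on this page, and hierarchy_url and the example.com address are hypothetical.

```python
def hierarchy_url(site_url, path):
    """Join a base URL and a stored path without doubling the slash."""
    # str.lstrip("/") stands in for string.popLeading (path, "/").
    return site_url + path.lstrip("/")


print(hierarchy_url("http://www.example.com/", "/products/widgets/red"))
```

Without the pop, the same inputs would produce a double slash between the host and the path.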