manilaSuite.search.indexMessage
Observation shows that this is a very expensive script on large Manila websites. It is called whenever a message is created or edited, provided the search engine is enabled for the website. There are two expensive regions:
1) A loop through every date from 1904 to 2040 to see whether the message is a news item for that day. 136 x 365 = 49,640, slightly less than 50,000 iterations, which is a lot of effort to see if a news item links to a page.
2) A walk through the site structure to determine whether the message number of the message being indexed appears anywhere.
I know of no simple fix for the first problem, but the second can be alleviated by using the Manila-maintained paths table, a table of page paths indexed by message number (the script reads it as adrHierarchy^.paths.[msgNum]). Using this structure noticeably speeds up user response and reduces the impact of indexing on the server.
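The difference between the two approaches to the second problem can be sketched as follows. This is an illustration in Python rather than UserTalk, and it is not the Manila implementation: the nested-dict site structure, the function names, and the example.com URL are all hypothetical. It only assumes what the script outline shows, namely that the paths table maps a message number to a path and that the URL is the site URL plus that path with its leading "/" removed.

```python
from typing import Optional

# Hypothetical site structure: each node has a path, a message number, and children.
SITE = {
    "path": "/", "msgNum": 1, "children": [
        {"path": "/about/", "msgNum": 12, "children": []},
        {"path": "/docs/", "msgNum": 30, "children": [
            {"path": "/docs/install/", "msgNum": 31, "children": []},
        ]},
    ],
}

def find_path_by_walk(node: dict, msg_num: int) -> Optional[str]:
    """Slow approach: visit nodes one by one until msg_num turns up."""
    if node["msgNum"] == msg_num:
        return node["path"]
    for child in node["children"]:
        found = find_path_by_walk(child, msg_num)
        if found is not None:
            return found
    return None

def build_paths_table(node: dict, table: Optional[dict] = None) -> dict:
    """Build the msgNum -> path table in one pass (redone only when the structure changes)."""
    if table is None:
        table = {}
    table[node["msgNum"]] = node["path"]
    for child in node["children"]:
        build_paths_table(child, table)
    return table

def url_for(site_url: str, paths: dict, msg_num: int) -> Optional[str]:
    """Fast approach: a single table lookup, then join site URL and path."""
    path = paths.get(msg_num)
    if path is None:
        return None  # the message is not a hierarchy page
    # Drop the leading "/" from the path, since site_url already ends with one.
    return site_url + path.lstrip("/")

paths = build_paths_table(SITE)
print(url_for("http://example.com/", paths, 31))
```

The walk costs time proportional to the size of the site on every indexing call, while the table lookup costs the same regardless of site size. The table only has to be rebuilt when the structure changes, which appears to be the role of the manilaSuite.siteStructure.compileIfDirty call in the script.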
| on indexMessage (adrMsgTable, adrSite=nil) |
| |
local (title, url, text, siteName, siteUrl, createdDate, lastModDate) //page info |
| |
local (flStory = false, flHierarchyPage = false, flCalendarPage = false, flNewsItem = false) //Manila-specific info; 10/29/00 JES: added flNewsItem |
| |
local (msgNum = adrMsgTable^.msgNum) //the message number of this message |
| |
local (adrPrefsTable) //the address of the prefs table in the dg storage |
| |
local (adrConfig) //the address of the search config table in the #newsSite table for this site |
| |
local (domain, port, methodName, rpcPath) //info about the search engine indexing server |
| |
local (flIndex = false) //flag for indexing or not |
| |
on logNoIndex (message) //log the fact that this page can't be indexed |
| |
bundle //get the address of this Manila site |
| |
bundle //get basic info about the page |
| |
bundle //get the URL of this page |
| |
«It's either a dg message, story, #hierarchy page, news item, or in the calendar structure (a present or former home page). |
| |
if not defined (adrSite^.["#urls"].discussMsgReader) |
| |
url = adrSite^.["#urls"].discussMsgReader |
| |
if not (url endsWith "$") |
| |
«Find out if it's a story. |
| |
if defined (adrMsgTable^.alsoListedIn) |
| |
«Find out if it's a hierarchy page. |
| |
local (adrHierarchy = @adrSite^.["#hierarchy"]) |
| |
«if defined (adrHierarchy^) |
| |
if defined (adrHierarchy^) //JES 8/13/04: implement optimization, factoring from 02/06/17, 14:55:23 by DAB |
| |
manilaSuite.siteStructure.compileIfDirty (adrSite) |
| |
if defined (adrHierarchy^.paths.[msgNum]) |
| |
url = adrSite^.["#ftpSite"].url + string.popLeading (adrHierarchy^.paths.[msgNum], "/") |
| |
«Find out if it's a posted news item, get the URL, and fix text and lastModDate |
| |
«Find out if it's in the calendar structure |
| |
if not flNewsItem //10/29/00 JES: we already know the URL of newsitem-based homepages |
| |
bundle //decide whether or not to index this page |
| |
bundle //get info about the search engine indexing server |
| |
bundle //send the page to the search engine indexing server |
| |
bundle //log the fact that the page was sent to the indexing server |
Relative to Frontier version 9.7b10