From: Cole Robinson
To: libvirt-list@redhat.com
Date: Wed, 3 Apr 2019 18:26:50 -0400
Message-Id: 
<5a008deb0273da0274c1b0b0f8a89e36887fff00.1554329992.git.crobinso@redhat.com>
Subject: [libvirt] [PATCH 2/3] docs: Remove index.py

This was used for generating the website search, which now just calls out to google. Remove it

Signed-off-by: Cole Robinson
Reviewed-by: Daniel P. Berrangé
---
 docs/index.py | 1266 -------------------------------------------------
 1 file changed, 1266 deletions(-)
 delete mode 100755 docs/index.py

diff --git a/docs/index.py b/docs/index.py
deleted file mode 100755
index 0d07ca4d05..0000000000
--- a/docs/index.py
+++ /dev/null
@@ -1,1266 +0,0 @@
-#!/usr/bin/env python2
-#
-# imports the API description and fills up a database with
-# name relevance to modules, functions or web pages
-#
-# Operation needed:
-# =================
-#
-# install mysqld, the python wrappers for mysql and libxml2, start mysqld
-# - mysql-server
-# - mysql
-# - php-mysql
-# - MySQL-python
-# Change the root passwd of mysql:
-# mysqladmin -u root password new_password
-# Create the new database libvir
-# mysqladmin -p create libvir
-# Create a database user 'veillard' and give him password access
-# change veillard and abcde with the right user name and passwd
-# mysql -p
-# password:
-# mysql> GRANT ALL PRIVILEGES ON libvir TO veillard@localhost
-# IDENTIFIED BY 'abcde' WITH GRANT OPTION;
-# mysql> GRANT ALL PRIVILEGES ON libvir.* TO veillard@localhost
-# IDENTIFIED BY 'abcde' WITH GRANT OPTION;
-#
-# As the user check the access:
-# mysql -p libvir
-# Enter password:
-# Welcome to the MySQL monitor....
-# mysql> use libvir
-# Database changed
-# mysql> quit
-# Bye
-#
-# Then run the script in the doc subdir, it will create the symbols and
-# word tables and populate them with information extracted from
-# the libvirt-api.xml API description, and make then accessible read-only
-# by nobody@loaclhost the user expected to be Apache's one
-#
-# On the Apache configuration, make sure you have php support enabled
-#
-
-import MySQLdb
-import libxml2
-import sys
-import string
-import os
-
-#
-# We are not interested in parsing errors here
-#
-def callback(ctx, str):
-    return
-libxml2.registerErrorHandler(callback, None)
-
-#
-# The dictionary of tables required and the SQL command needed
-# to create them
-#
-TABLES = {
-    "symbols": """CREATE TABLE symbols (
-        name varchar(255) BINARY NOT NULL,
-        module varchar(255) BINARY NOT NULL,
-        type varchar(25) NOT NULL,
-        descr varchar(255),
-        UNIQUE KEY name (name),
-        KEY module (module))""",
-    "words": """CREATE TABLE words (
-        name varchar(50) BINARY NOT NULL,
-        symbol varchar(255) BINARY NOT NULL,
-        relevance int,
-        KEY name (name),
-        KEY symbol (symbol),
-        UNIQUE KEY ID (name, symbol))""",
-    "wordsHTML": """CREATE TABLE wordsHTML (
-        name varchar(50) BINARY NOT NULL,
-        resource varchar(255) BINARY NOT NULL,
-        section varchar(255),
-        id varchar(50),
-        relevance int,
-        KEY name (name),
-        KEY resource (resource),
-        UNIQUE KEY ref (name, resource))""",
-    "wordsArchive": """CREATE TABLE wordsArchive (
-        name varchar(50) BINARY NOT NULL,
-        ID int(11) NOT NULL,
-        relevance int,
-        KEY name (name),
-        UNIQUE KEY ref (name, ID))""",
-    "pages": """CREATE TABLE pages (
-        resource varchar(255) BINARY NOT NULL,
-        title varchar(255) BINARY NOT NULL,
-        UNIQUE KEY name (resource))""",
-    "archives": """CREATE TABLE archives (
-        ID int(11) NOT NULL auto_increment,
-        resource varchar(255) BINARY NOT NULL,
-        title varchar(255) BINARY NOT NULL,
-        UNIQUE KEY id (ID,resource(255)),
-        INDEX (ID),
-        INDEX (resource))""",
-    "Queries": """CREATE TABLE Queries (
-        ID int(11) NOT NULL auto_increment,
-        Value varchar(50) NOT NULL,
-        Count int(11) NOT NULL,
-        UNIQUE KEY id (ID,Value(35)),
-        INDEX (ID))""",
-    "AllQueries": """CREATE TABLE AllQueries (
-        ID int(11) NOT NULL auto_increment,
-        Value varchar(50) NOT NULL,
-        Count int(11) NOT NULL,
-        UNIQUE KEY id (ID,Value(35)),
-        INDEX (ID))""",
-}
-
-#
-# The XML API description file to parse
-#
-API = "libvirt-api.xml"
-DB = None
-
-#########################################################################
-# #
-# MySQL database interfaces #
-# #
-#########################################################################
-def createTable(db, name):
-    global TABLES
-
-    if db is None:
-        return -1
-    if name is None:
-        return -1
-    c = db.cursor()
-
-    ret = c.execute("DROP TABLE IF EXISTS %s" % (name))
-    if ret == 1:
-        print "Removed table %s" % (name)
-    print "Creating table %s" % (name)
-    try:
-        ret = c.execute(TABLES[name])
-    except:
-        print "Failed to create table %s" % (name)
-        return -1
-    return ret
-
-def checkTables(db, verbose=1):
-    global TABLES
-
-    if db is None:
-        return -1
-    c = db.cursor()
-    nbtables = c.execute("show tables")
-    if verbose:
-        print "Found %d tables" % (nbtables)
-    tables = {}
-    i = 0
-    while i < nbtables:
-        l = c.fetchone()
-        name = l[0]
-        tables[name] = {}
-        i = i + 1
-
-    for table in TABLES.keys():
-        if not tables.has_key(table):
-            print "table %s missing" % (table)
-            createTable(db, table)
-        try:
-            ret = c.execute("SELECT count(*) from %s" % table)
-            row = c.fetchone()
-            if verbose:
-                print "Table %s contains %d records" % (table, row[0])
-        except:
-            print "Troubles with table %s: repairing" % (table)
-            ret = c.execute("repair table %s" % table)
-            print "repairing returned %d" % (ret)
-            ret = c.execute("SELECT count(*) from %s" % table)
-            row = c.fetchone()
-            print "Table %s contains %d records" % (table, row[0])
-    if verbose:
-        print "checkTables finished"
-
-    # make sure apache can access the tables read-only
-    try:
-        ret = c.execute("GRANT SELECT ON libvir.* TO nobody@localhost")
-        ret = c.execute("GRANT INSERT,SELECT,UPDATE ON libvir.Queries TO nobody@localhost")
-    except:
-        pass
-    return 0
-
-def openMySQL(db="libvir", passwd=None, verbose=1):
-    global DB
-
-    if passwd is None:
-        try:
-            passwd = os.environ["MySQL_PASS"]
-        except:
-            print "No password available, set environment MySQL_PASS"
-            sys.exit(1)
-
-    DB = MySQLdb.connect(passwd=passwd, db=db)
-    if DB is None:
-        return -1
-    ret = checkTables(DB, verbose)
-    return ret
-
-def updateWord(name, symbol, relevance):
-    global DB
-
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if name is None:
-        return -1
-    if symbol is None:
-        return -1
-
-    c = DB.cursor()
-    try:
-        ret = c.execute(
-"""INSERT INTO words (name, symbol, relevance) VALUES ('%s','%s', %d)""" %
-                (name, symbol, relevance))
-    except:
-        try:
-            ret = c.execute(
-    """UPDATE words SET relevance = %d where name = '%s' and symbol = '%s'""" %
-                (relevance, name, symbol))
-        except:
-            print "Update word (%s, %s, %s) failed command" % (name, symbol, relevance)
-            print "UPDATE words SET relevance = %d where name = '%s' and symbol = '%s'" % (relevance, name, symbol)
-            print sys.exc_type, sys.exc_value
-            return -1
-
-    return ret
-
-def updateSymbol(name, module, type, desc):
-    global DB
-
-    updateWord(name, name, 50)
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if name is None:
-        return -1
-    if module is None:
-        return -1
-    if type is None:
-        return -1
-
-    try:
-        desc = string.replace(desc, "'", " ")
-        l = string.split(desc, ".")
-        desc = l[0]
-        desc = desc[0:99]
-    except:
-        desc = ""
-
-    c = DB.cursor()
-    try:
-        ret = c.execute(
-"""INSERT INTO symbols (name, module, type, descr) VALUES ('%s','%s', '%s', '%s')""" %
-                (name, module, type, desc))
-    except:
-        try:
-            ret = c.execute(
-"""UPDATE symbols SET module='%s', type='%s', descr='%s' where name='%s'""" %
-                (module, type, desc, name))
-        except:
-            print "Update symbol (%s, %s, %s) failed command" % (name, module, type)
-            print """UPDATE symbols SET module='%s', type='%s', descr='%s' where name='%s'""" % (module, type, desc, name)
-            print sys.exc_type, sys.exc_value
-            return -1
-
-    return ret
-
-def addFunction(name, module, desc=""):
-    return updateSymbol(name, module, 'function', desc)
-
-def addMacro(name, module, desc=""):
-    return updateSymbol(name, module, 'macro', desc)
-
-def addEnum(name, module, desc=""):
-    return updateSymbol(name, module, 'enum', desc)
-
-def addStruct(name, module, desc=""):
-    return updateSymbol(name, module, 'struct', desc)
-
-def addConst(name, module, desc=""):
-    return updateSymbol(name, module, 'const', desc)
-
-def addType(name, module, desc=""):
-    return updateSymbol(name, module, 'type', desc)
-
-def addFunctype(name, module, desc=""):
-    return updateSymbol(name, module, 'functype', desc)
-
-def addPage(resource, title):
-    global DB
-
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if resource is None:
-        return -1
-
-    c = DB.cursor()
-    try:
-        ret = c.execute(
-            """INSERT INTO pages (resource, title) VALUES ('%s','%s')""" %
-            (resource, title))
-    except:
-        try:
-            ret = c.execute(
-                """UPDATE pages SET title='%s' WHERE resource='%s'""" %
-                (title, resource))
-        except:
-            print "Update symbol (%s, %s, %s) failed command" % (name, module, type)
-            print """UPDATE pages SET title='%s' WHERE resource='%s'""" % (title, resource)
-            print sys.exc_type, sys.exc_value
-            return -1
-
-    return ret
-
-def updateWordHTML(name, resource, desc, id, relevance):
-    global DB
-
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if name is None:
-        return -1
-    if resource is None:
-        return -1
-    if id is None:
-        id = ""
-    if desc is None:
-        desc = ""
-    else:
-        try:
-            desc = string.replace(desc, "'", " ")
-            desc = desc[0:99]
-        except:
-            desc = ""
-
-    c = DB.cursor()
-    try:
-        ret = c.execute(
-"""INSERT INTO wordsHTML (name, resource, section, id, relevance) VALUES ('%s','%s', '%s', '%s', '%d')""" %
-                (name, resource, desc, id, relevance))
-    except:
-        try:
-            ret = c.execute(
-"""UPDATE wordsHTML SET section='%s', id='%s', relevance='%d' where name='%s' and resource='%s'""" %
-                (desc, id, relevance, name, resource))
-        except:
-            print "Update symbol (%s, %s, %d) failed command" % (name, resource, relevance)
-            print """UPDATE wordsHTML SET section='%s', id='%s', relevance='%d' where name='%s' and resource='%s'""" % (desc, id, relevance, name, resource)
-            print sys.exc_type, sys.exc_value
-            return -1
-
-    return ret
-
-def checkXMLMsgArchive(url):
-    global DB
-
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if url is None:
-        return -1
-
-    c = DB.cursor()
-    try:
-        ret = c.execute(
-            """SELECT ID FROM archives WHERE resource='%s'""" % (url))
-        row = c.fetchone()
-        if row is None:
-            return -1
-    except:
-        return -1
-
-    return row[0]
-
-def addXMLMsgArchive(url, title):
-    global DB
-
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if url is None:
-        return -1
-    if title is None:
-        title = ""
-    else:
-        title = string.replace(title, "'", " ")
-        title = title[0:99]
-
-    c = DB.cursor()
-    try:
-        cmd = """INSERT INTO archives (resource, title) VALUES ('%s','%s')""" % (url, title)
-        ret = c.execute(cmd)
-        cmd = """SELECT ID FROM archives WHERE resource='%s'""" % (url)
-        ret = c.execute(cmd)
-        row = c.fetchone()
-        if row is None:
-            print "addXMLMsgArchive failed to get the ID: %s" % (url)
-            return -1
-    except:
-        print "addXMLMsgArchive failed command: %s" % (cmd)
-        return -1
-
-    return((int)(row[0]))
-
-def updateWordArchive(name, id, relevance):
-    global DB
-
-    if DB is None:
-        openMySQL()
-    if DB is None:
-        return -1
-    if name is None:
-        return -1
-    if id is None:
-        return -1
-
-    c = DB.cursor()
-    try:
-        ret = c.execute(
-"""INSERT INTO wordsArchive (name, id, relevance) VALUES ('%s', '%d', '%d')""" %
-                (name, id, relevance))
-    except:
-        try:
-            ret = c.execute(
-"""UPDATE wordsArchive SET relevance='%d' where name='%s' and ID='%d'""" %
-                (relevance, name, id))
-        except:
-            print "Update word archive (%s, %d, %d) failed command" % (name, id, relevance)
-            print """UPDATE wordsArchive SET relevance='%d' where name='%s' and ID='%d'""" % (relevance, name, id)
-            print sys.exc_type, sys.exc_value
-            return -1
-
-    return ret
-
-#########################################################################
-# #
-# Word dictionary and analysis routines #
-# #
-#########################################################################
-
-#
-# top 100 english word without the one len < 3 + own set
-#
-dropWords = {
-    'the':0, 'this':0, 'can':0, 'man':0, 'had':0, 'him':0, 'only':0,
-    'and':0, 'not':0, 'been':0, 'other':0, 'even':0, 'are':0, 'was':0,
-    'new':0, 'most':0, 'but':0, 'when':0, 'some':0, 'made':0, 'from':0,
-    'who':0, 'could':0, 'after':0, 'that':0, 'will':0, 'time':0, 'also':0,
-    'have':0, 'more':0, 'these':0, 'did':0, 'was':0, 'two':0, 'many':0,
-    'they':0, 'may':0, 'before':0, 'for':0, 'which':0, 'out':0, 'then':0,
-    'must':0, 'one':0, 'through':0, 'with':0, 'you':0, 'said':0,
-    'first':0, 'back':0, 'were':0, 'what':0, 'any':0, 'years':0, 'his':0,
-    'her':0, 'where':0, 'all':0, 'its':0, 'now':0, 'much':0, 'she':0,
-    'about':0, 'such':0, 'your':0, 'there':0, 'into':0, 'like':0, 'may':0,
-    'would':0, 'than':0, 'our':0, 'well':0, 'their':0, 'them':0, 'over':0,
-    'down':0,
-    'net':0, 'www':0, 'bad':0, 'Okay':0, 'bin':0, 'cur':0,
-}
-
-wordsDict = {}
-wordsDictHTML = {}
-wordsDictArchive = {}
-
-def cleanupWordsString(str):
-    str = string.replace(str, ".", " ")
-    str = string.replace(str, "!", " ")
-    str = string.replace(str, "?", " ")
-    str = string.replace(str, ",", " ")
-    str = string.replace(str, "'", " ")
-    str = string.replace(str, '"', " ")
-    str = string.replace(str, ";", " ")
-    str = string.replace(str, "(", " ")
-    str = string.replace(str, ")", " ")
-    str = string.replace(str, "{", " ")
-    str = string.replace(str, "}", " ")
-    str = string.replace(str, "<", " ")
-    str = string.replace(str, ">", " ")
-    str = string.replace(str, "=", " ")
-    str = string.replace(str, "/", " ")
-    str = string.replace(str, "*", " ")
-    str = string.replace(str, ":", " ")
-    str = string.replace(str, "#", " ")
-    str = string.replace(str, "\\", " ")
-    str = string.replace(str, "\n", " ")
-    str = string.replace(str, "\r", " ")
-    str = string.replace(str, "\xc2", " ")
-    str = string.replace(str, "\xa0", " ")
-    return str
-
-def cleanupDescrString(str):
-    str = string.replace(str, "'", " ")
-    str = string.replace(str, "\n", " ")
-    str = string.replace(str, "\r", " ")
-    str = string.replace(str, "\xc2", " ")
-    str = string.replace(str, "\xa0", " ")
-    l = string.split(str)
-    str = string.join(str)
-    return str
-
-def splitIdentifier(str):
-    ret = []
-    while str != "":
-        cur = string.lower(str[0])
-        str = str[1:]
-        if ((cur < 'a') or (cur > 'z')):
-            continue
-        while (str != "") and (str[0] >= 'A') and (str[0] <= 'Z'):
-            cur = cur + string.lower(str[0])
-            str = str[1:]
-        while (str != "") and (str[0] >= 'a') and (str[0] <= 'z'):
-            cur = cur + str[0]
-            str = str[1:]
-        while (str != "") and (str[0] >= '0') and (str[0] <= '9'):
-            str = str[1:]
-        ret.append(cur)
-    return ret
-
-def addWord(word, module, symbol, relevance):
-    global wordsDict
-
-    if word is None or len(word) < 3:
-        return -1
-    if module is None or symbol is None:
-        return -1
-    if dropWords.has_key(word):
-        return 0
-    if ord(word[0]) > 0x80:
-        return 0
-
-    if wordsDict.has_key(word):
-        d = wordsDict[word]
-        if d is None:
-            return 0
-        if len(d) > 500:
-            wordsDict[word] = None
-            return 0
-        try:
-            relevance = relevance + d[(module, symbol)]
-        except:
-            pass
-    else:
-        wordsDict[word] = {}
-    wordsDict[word][(module, symbol)] = relevance
-    return relevance
-
-def addString(str, module, symbol, relevance):
-    if str is None or len(str) < 3:
-        return -1
-    ret = 0
-    str = cleanupWordsString(str)
-    l = string.split(str)
-    for word in l:
-        if len(word) > 2:
-            ret = ret + addWord(word, module, symbol, 5)
-
-    return ret
-
-def addWordHTML(word, resource, id, section, relevance):
-    global wordsDictHTML
-
-    if word is None or len(word) < 3:
-        return -1
-    if resource is None or section is None:
-        return -1
-    if dropWords.has_key(word):
-        return 0
-    if ord(word[0]) > 0x80:
-        return 0
-
-    section = cleanupDescrString(section)
-
-    if wordsDictHTML.has_key(word):
-        d = wordsDictHTML[word]
-        if d is None:
-            print "skipped %s" % (word)
-            return 0
-        try:
-            (r,i,s) = d[resource]
-            if i is not None:
-                id = i
-            if s is not None:
-                section = s
-            relevance = relevance + r
-        except:
-            pass
-    else:
-        wordsDictHTML[word] = {}
-    d = wordsDictHTML[word]
-    d[resource] = (relevance, id, section)
-    return relevance
-
-def addStringHTML(str, resource, id, section, relevance):
-    if str is None or len(str) < 3:
-        return -1
-    ret = 0
-    str = cleanupWordsString(str)
-    l = string.split(str)
-    for word in l:
-        if len(word) > 2:
-            try:
-                r = addWordHTML(word, resource, id, section, relevance)
-                if r < 0:
-                    print "addWordHTML failed: %s %s" % (word, resource)
-                ret = ret + r
-            except:
-                print "addWordHTML failed: %s %s %d" % (word, resource, relevance)
-                print sys.exc_type, sys.exc_value
-
-    return ret
-
-def addWordArchive(word, id, relevance):
-    global wordsDictArchive
-
-    if word is None or len(word) < 3:
-        return -1
-    if id is None or id == -1:
-        return -1
-    if dropWords.has_key(word):
-        return 0
-    if ord(word[0]) > 0x80:
-        return 0
-
-    if wordsDictArchive.has_key(word):
-        d = wordsDictArchive[word]
-        if d is None:
-            print "skipped %s" % (word)
-            return 0
-        try:
-            r = d[id]
-            relevance = relevance + r
-        except:
-            pass
-    else:
-        wordsDictArchive[word] = {}
-    d = wordsDictArchive[word]
-    d[id] = relevance
-    return relevance
-
-def addStringArchive(str, id, relevance):
-    if str is None or len(str) < 3:
-        return -1
-    ret = 0
-    str = cleanupWordsString(str)
-    l = string.split(str)
-    for word in l:
-        i = len(word)
-        if i > 2:
-            try:
-                r = addWordArchive(word, id, relevance)
-                if r < 0:
-                    print "addWordArchive failed: %s %s" % (word, id)
-                else:
-                    ret = ret + r
-            except:
-                print "addWordArchive failed: %s %s %d" % (word, id, relevance)
-                print sys.exc_type, sys.exc_value
-    return ret
-
-#########################################################################
-# #
-# XML API description analysis #
-# #
-#########################################################################
-
-def loadAPI(filename):
-    doc = libxml2.parseFile(filename)
-    print "loaded %s" % (filename)
-    return doc
-
-def foundExport(file, symbol):
-    if file is None:
-        return 0
-    if symbol is None:
-        return 0
-    addFunction(symbol, file)
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-    return 1
-
-def analyzeAPIFile(top):
-    count = 0
-    name = top.prop("name")
-    cur = top.children
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "exports":
-            count = count + foundExport(name, cur.prop("symbol"))
-        else:
-            print "unexpected element %s in API doc " % (name)
-        cur = cur.next
-    return count
-
-def analyzeAPIFiles(top):
-    count = 0
-    cur = top.children
-
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "file":
-            count = count + analyzeAPIFile(cur)
-        else:
-            print "unexpected element %s in API doc " % (cur.name)
-        cur = cur.next
-    return count
-
-def analyzeAPIEnum(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-
-    addEnum(symbol, file)
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-
-    return 1
-
-def analyzeAPIConst(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-
-    addConst(symbol, file)
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-
-    return 1
-
-def analyzeAPIType(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-
-    addType(symbol, file)
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-    return 1
-
-def analyzeAPIFunctype(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-
-    addFunctype(symbol, file)
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-    return 1
-
-def analyzeAPIStruct(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-
-    addStruct(symbol, file)
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-
-    info = top.prop("info")
-    if info is not None:
-        info = string.replace(info, "'", " ")
-        info = string.strip(info)
-        l = string.split(info)
-        for word in l:
-            if len(word) > 2:
-                addWord(word, file, symbol, 5)
-    return 1
-
-def analyzeAPIMacro(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-    symbol = string.replace(symbol, "'", " ")
-    symbol = string.strip(symbol)
-
-    info = None
-    cur = top.children
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "info":
-            info = cur.content
-            break
-        cur = cur.next
-
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-
-    if info is None:
-        addMacro(symbol, file)
-        print "Macro %s description has no " % (symbol)
-        return 0
-
-    info = string.replace(info, "'", " ")
-    info = string.strip(info)
-    addMacro(symbol, file, info)
-    l = string.split(info)
-    for word in l:
-        if len(word) > 2:
-            addWord(word, file, symbol, 5)
-    return 1
-
-def analyzeAPIFunction(top):
-    file = top.prop("file")
-    if file is None:
-        return 0
-    symbol = top.prop("name")
-    if symbol is None:
-        return 0
-
-    symbol = string.replace(symbol, "'", " ")
-    symbol = string.strip(symbol)
-    info = None
-    cur = top.children
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "info":
-            info = cur.content
-        elif cur.name == "return":
-            rinfo = cur.prop("info")
-            if rinfo is not None:
-                rinfo = string.replace(rinfo, "'", " ")
-                rinfo = string.strip(rinfo)
-                addString(rinfo, file, symbol, 7)
-        elif cur.name == "arg":
-            ainfo = cur.prop("info")
-            if ainfo is not None:
-                ainfo = string.replace(ainfo, "'", " ")
-                ainfo = string.strip(ainfo)
-                addString(ainfo, file, symbol, 5)
-            name = cur.prop("name")
-            if name is not None:
-                name = string.replace(name, "'", " ")
-                name = string.strip(name)
-                addWord(name, file, symbol, 7)
-        cur = cur.next
-    if info is None:
-        print "Function %s description has no " % (symbol)
-        addFunction(symbol, file, "")
-    else:
-        info = string.replace(info, "'", " ")
-        info = string.strip(info)
-        addFunction(symbol, file, info)
-        addString(info, file, symbol, 5)
-
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-
-    return 1
-
-def analyzeAPISymbols(top):
-    count = 0
-    cur = top.children
-
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "macro":
-            count = count + analyzeAPIMacro(cur)
-        elif cur.name == "function":
-            count = count + analyzeAPIFunction(cur)
-        elif cur.name == "const":
-            count = count + analyzeAPIConst(cur)
-        elif cur.name == "typedef":
-            count = count + analyzeAPIType(cur)
-        elif cur.name == "struct":
-            count = count + analyzeAPIStruct(cur)
-        elif cur.name == "enum":
-            count = count + analyzeAPIEnum(cur)
-        elif cur.name == "functype":
-            count = count + analyzeAPIFunctype(cur)
-        else:
-            print "unexpected element %s in API doc " % (cur.name)
-        cur = cur.next
-    return count
-
-def analyzeAPI(doc):
-    count = 0
-    if doc is None:
-        return -1
-    root = doc.getRootElement()
-    if root.name != "api":
-        print "Unexpected root name"
-        return -1
-    cur = root.children
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "files":
-            pass
-# count = count + analyzeAPIFiles(cur)
-        elif cur.name == "symbols":
-            count = count + analyzeAPISymbols(cur)
-        else:
-            print "unexpected element %s in API doc" % (cur.name)
-        cur = cur.next
-    return count
-
-#########################################################################
-# #
-# Web pages parsing and analysis #
-# #
-#########################################################################
-
-import glob
-
-def analyzeHTMLText(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTMLPara(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTMLPre(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTML(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTML(doc, resource):
-    para = 0
-    ctxt = doc.xpathNewContext()
-    try:
-        res = ctxt.xpathEval("//head/title")
-        title = res[0].content
-    except:
-        title = "Page %s" % (resource)
-    addPage(resource, title)
-    try:
-        items = ctxt.xpathEval("//h1 | //h2 | //h3 | //text()")
-        section = title
-        id = ""
-        for item in items:
-            if item.name == 'h1' or item.name == 'h2' or item.name == 'h3':
-                section = item.content
-                if item.prop("id"):
-                    id = item.prop("id")
-                elif item.prop("name"):
-                    id = item.prop("name")
-            elif item.type == 'text':
-                analyzeHTMLText(doc, resource, item, section, id)
-                para = para + 1
-            elif item.name == 'p':
-                analyzeHTMLPara(doc, resource, item, section, id)
-                para = para + 1
-            elif item.name == 'pre':
-                analyzeHTMLPre(doc, resource, item, section, id)
-                para = para + 1
-            else:
-                print "Page %s, unexpected %s element" % (resource, item.name)
-    except:
-        print "Page %s: problem analyzing" % (resource)
-        print sys.exc_type, sys.exc_value
-
-    return para
-
-def analyzeHTMLPages():
-    ret = 0
-    HTMLfiles = glob.glob("*.html") + glob.glob("tutorial/*.html") + \
-                glob.glob("CIM/*.html") + glob.glob("ocaml/*.html") + \
-                glob.glob("ruby/*.html")
-    for html in HTMLfiles:
-        if html[0:3] == "API":
-            continue
-        if html == "xml.html":
-            continue
-        try:
-            doc = libxml2.parseFile(html)
-        except:
-            doc = libxml2.htmlParseFile(html, None)
-        try:
-            res = analyzeHTML(doc, html)
-            print "Parsed %s: %d paragraphs" % (html, res)
-            ret = ret + 1
-        except:
-            print "could not parse %s" % (html)
-    return ret
-
-#########################################################################
-# #
-# Mail archives parsing and analysis #
-# #
-#########################################################################
-
-import time
-
-def getXMLDateArchive(t=None):
-    if t is None:
-        t = time.time()
-    T = time.gmtime(t)
-    month = time.strftime("%B", T)
-    year = T[0]
-    url = "http://www.redhat.com/archives/libvir-list/%d-%s/date.html" % (year, month)
-    return url
-
-def scanXMLMsgArchive(url, title, force=0):
-    if url is None or title is None:
-        return 0
-
-    ID = checkXMLMsgArchive(url)
-    if force == 0 and ID != -1:
-        return 0
-
-    if ID == -1:
-        ID = addXMLMsgArchive(url, title)
-        if ID == -1:
-            return 0
-
-    try:
-        print "Loading %s" % (url)
-        doc = libxml2.htmlParseFile(url, None)
-    except:
-        doc = None
-    if doc is None:
-        print "Failed to parse %s" % (url)
-        return 0
-
-    addStringArchive(title, ID, 20)
-    ctxt = doc.xpathNewContext()
-    texts = ctxt.xpathEval("//pre//text()")
-    for text in texts:
-        addStringArchive(text.content, ID, 5)
-
-    return 1
-
-def scanXMLDateArchive(t=None, force=0):
-    global wordsDictArchive
-
-    wordsDictArchive = {}
-
-    url = getXMLDateArchive(t)
-    print "loading %s" % (url)
-    try:
-        doc = libxml2.htmlParseFile(url, None)
-    except:
-        doc = None
-    if doc is None:
-        print "Failed to parse %s" % (url)
-        return -1
-    ctxt = doc.xpathNewContext()
-    anchors = ctxt.xpathEval("//a[@href]")
-    links = 0
-    newmsg = 0
-    for anchor in anchors:
-        href = anchor.prop("href")
-        if href is None or href[0:3] != "msg":
-            continue
-        try:
-            links = links + 1
-
-            msg = libxml2.buildURI(href, url)
-            title = anchor.content
-            if title is not None and title[0:4] == 'Re: ':
-                title = title[4:]
-            if title is not None and title[0:6] == '[xml] ':
-                title = title[6:]
-            newmsg = newmsg + scanXMLMsgArchive(msg, title, force)
-
-        except:
-            pass
-
-    return newmsg
-
-
-#########################################################################
-# #
-# Main code: open the DB, the API XML and analyze it #
-# #
-#########################################################################
-def analyzeArchives(t=None, force=0):
-    global wordsDictArchive
-
-    ret = scanXMLDateArchive(t, force)
-    print "Indexed %d words in %d archive pages" % (len(wordsDictArchive), ret)
-
-    i = 0
-    skipped = 0
-    for word in wordsDictArchive.keys():
-        refs = wordsDictArchive[word]
-        if refs is None:
-            skipped = skipped + 1
-            continue
-        for id in refs.keys():
-            relevance = refs[id]
-            updateWordArchive(word, id, relevance)
-            i = i + 1
-
-    print "Found %d associations in HTML pages" % (i)
-
-def analyzeHTMLTop():
-    global wordsDictHTML
-
-    ret = analyzeHTMLPages()
-    print "Indexed %d words in %d HTML pages" % (len(wordsDictHTML), ret)
-
-    i = 0
-    skipped = 0
-    for word in wordsDictHTML.keys():
-        refs = wordsDictHTML[word]
-        if refs is None:
-            skipped = skipped + 1
-            continue
-        for resource in refs.keys():
-            (relevance, id, section) = refs[resource]
-            updateWordHTML(word, resource, section, id, relevance)
-            i = i + 1
-
-    print "Found %d associations in HTML pages" % (i)
-
-def analyzeAPITop():
-    global wordsDict
-    global API
-
-    try:
-        doc = loadAPI(API)
-        ret = analyzeAPI(doc)
-        print "Analyzed %d blocs" % (ret)
-        doc.freeDoc()
-    except:
-        print "Failed to parse and analyze %s" % (API)
-        print sys.exc_type, sys.exc_value
-        sys.exit(1)
-
-    print "Indexed %d words" % (len(wordsDict))
-    i = 0
-    skipped = 0
-    for word in wordsDict.keys():
-        refs = wordsDict[word]
-        if refs is None:
-            skipped = skipped + 1
-            continue
-        for (module, symbol) in refs.keys():
-            updateWord(word, symbol, refs[(module, symbol)])
-            i = i + 1
-
-    print "Found %d associations, skipped %d words" % (i, skipped)
-
-def usage():
-    print "Usage index.py [--force] [--archive] [--archive-year year] [--archive-month month] [--API] [--docs]"
-    sys.exit(1)
-
-def main():
-    try:
-        openMySQL()
-    except:
-        print "Failed to open the database"
-        print sys.exc_type, sys.exc_value
-        sys.exit(1)
-
-    args = sys.argv[1:]
-    force = 0
-    if args:
-        i = 0
-        while i < len(args):
-            if args[i] == '--force':
-                force = 1
-            elif args[i] == '--archive':
-                analyzeArchives(None, force)
-            elif args[i] == '--archive-year':
-                i = i + 1
-                year = args[i]
-                months = ["January", "February", "March", "April", "May",
-                          "June", "July", "August", "September", "October",
-                          "November", "December"]
-                for month in months:
-                    try:
-                        str = "%s-%s" % (year, month)
-                        T = time.strptime(str, "%Y-%B")
-                        t = time.mktime(T) + 3600 * 24 * 10
-                        analyzeArchives(t, force)
-                    except:
-                        print "Failed to index month archive:"
-                        print sys.exc_type, sys.exc_value
-            elif args[i] == '--archive-month':
-                i = i + 1
-                month = args[i]
-                try:
-                    T = time.strptime(month, "%Y-%B")
-                    t = time.mktime(T) + 3600 * 24 * 10
-                    analyzeArchives(t, force)
-                except:
-                    print "Failed to index month archive:"
-                    print sys.exc_type, sys.exc_value
-            elif args[i] == '--API':
-                analyzeAPITop()
-            elif args[i] == '--docs':
-                analyzeHTMLTop()
-            else:
-                usage()
-            i = i + 1
-    else:
-        usage()
-
-if __name__ == "__main__":
-    main()
-- 
2.21.0