====== Data in database mnogosearch ======

**d...@mapluz.fr Wed, 30 Oct 2013 02:09:39 -0700**

Hi,

I'm using mnogosearch 3.2 on a Linux Ubuntu server. I have created a MySQL database, and my indexer.conf file is here: http://www.mapluz.fr/public/indexer.conf

I initialize the indexer with this command:

  /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

I run the indexer with this command:

  /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexerer.conf

My search with the PHP API sample (the search.php sample provided with mnogosearch) returns no results. So perhaps my database has a problem. I have a question about the bdict table; here is an example: http://www.mapluz.fr/public/capture.jpg

Why is the bdict table so small?

**Alexander Barkov Wed, 30 Oct 2013 07:54:07 -0700**

Hi,

Hmm, 3.2 sounds very old. Why not 3.3?

The commands are to be run in the opposite order:

1. Crawl documents:

  /usr/local/mnogosearch/sbin/indexer -d /usr/local/mnogosearch/etc/indexerer.conf

2. Index the documents collected by the crawler:

  /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

Try to run this command again:

  /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

Does the size of the table "bdict" change?

====== wordstat ======

**Question**

On 12/31/2013 01:35 PM, Developpement Team Hodei wrote:

Hi, I'm using mnogosearch 3.2 on an Ubuntu server. I index with DBMODE=BLOB. I want to use autocompletion in my input text box when a user searches on the web. So where are the words that mnogosearch has indexed? My wrdstat table is empty, but there seem to be words in bdict.

**Response**

Run "indexer --wordstat" to populate the table.

Btw, which tools are you going to use to implement autocompletion? It would be nice to make autocompletion available out of the box in the next development branch (3.4.x).
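As a possible answer to the tooling question above, here is a minimal sketch of prefix-based autocompletion done on the application side. It is not something mnoGoSearch provides. It assumes the words and their occurrence counts have already been exported from the populated wrdstat table into (word, count) pairs; the table's column layout is not shown in this thread, so the sample rows and the helper names ''build_index'' and ''suggest'' are hypothetical.

```python
from bisect import bisect_left


def build_index(rows):
    """Return (word, count) pairs sorted by lowercased word.

    `rows` is assumed to be an export of word statistics; the exact
    source columns are an assumption, not taken from mnoGoSearch docs.
    """
    return sorted((word.lower(), count) for word, count in rows)


def suggest(index, prefix, limit=5):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    prefix = prefix.lower()
    if not prefix:
        return []
    # A one-element tuple sorts before any (word, count) pair with the same word,
    # so this finds the first entry whose word is >= prefix.
    start = bisect_left(index, (prefix,))
    matches = []
    for word, count in index[start:]:
        if not word.startswith(prefix):
            break  # sorted order: no later word can match
        matches.append((word, count))
    # Rank by descending count, then alphabetically for stable output.
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return [word for word, _ in matches[:limit]]


# Hypothetical sample data standing in for exported word statistics.
sample_rows = [("recette", 12), ("restaurant", 30), ("surf", 4), ("Recettes", 7)]
index = build_index(sample_rows)
print(suggest(index, "re"))
```

Keeping the list sorted lets each lookup start with a binary search, which should stay cheap even for a large vocabulary; a real deployment would still need to decide how often to refresh the export.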
====== Configure indexer.conf to disable RSS feeds ======

Hi,

On 01/27/2014 01:04 PM, Developpement Team Hodei wrote:

Hi, I configured my indexer.conf file for web search to disable RSS feeds. I added this:

  Disallow *.xml

but it does not work: when users search on our web site and type the string "rss" in the search box, all the RSS feeds appear. Do you have any idea?

From: "Alexander Barkov"

On 01/27/2014 03:44 PM, Developpement Team Hodei wrote:

Yes, my Disallow commands are before the Allow commands, but the pages that mnogosearch finds are indeed not XML files but feeds under a feed directory. The feed URLs are:

  http://www.graindesoleil.fr/category/sans-gluten/recettes-sans-gluten/feed/
  http://www.hotel-restaurant-euzkadi.com/en/lequipe/feed/
  http://bidarteko.com/2013/01/25/urumea-surf/feed/

Maybe there is something I did not understand about RSS.

You need to do something like this:

  Disallow http://www.graindesoleil.fr/category/sans-gluten/recettes-sans-gluten/feed/*

====== indexer causes a segmentation fault ======

**dev@hodei.net 17/03/2014 19:23**

Hi,

I tried this command to index my list of web sites:

  /usr/local/mnogosearch/sbin/indexer -Eblob -d /usr/local/mnogosearch/etc/indexer.conf

The result is:

  Segmentation fault

Here is the information about my SQL database:

  +------------+-----------+
  | Tables     | Size (MB) |
  +------------+-----------+
  | bdict      |    191.90 |
  | bdicti     |    605.58 |
  | categories |      0.00 |
  | crossdict  |      0.00 |
  | dict       |      0.00 |
  | links      |      0.00 |
  | qcache     |      0.00 |
  | qinfo      |      0.00 |
  | qtrack     |      0.00 |
  | server     |      0.18 |
  | srvinfo    |      0.00 |
  | url        |   3028.87 |
  | urlinfo    |   2110.99 |
  | wrdstat    |      0.00 |
  +------------+-----------+

Do you have any idea?

----

My config:

  * Debian 3.2.51-1 x86_64 GNU/Linux
  * mnogosearch 3.3.15
  * indexer.conf: ...... DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob ......
  * configure command used for installation:

  ../configure --prefix=/usr/local/mnogosearch --bindir=/usr/local/mnogosearch/bin --sbindir=/usr/local/mnogosearch/sbin --sysconfdir=/usr/local/mnogosearch/etc --localstatedir=/usr/local/mnogosearch/var --libdir=/usr/local/mnogosearch/lib --includedir=/usr/local/mnogosearch/include --mandir=/usr/local/mnogosearch/man --disable-shared --enable-static --enable-syslog --without-docs --enable-pthreads --disable-dmalloc --enable-parser --disable-mp3 --disable-xml --disable-rss --disable-css --disable-js --with-extra-charsets=all --enable-file --enable-http --enable-ftp --enable-htdb --enable-news --with-mysql --with-zlib

**bar@udm.net 18/03/2014 03:12**

Hi,

Please try to get a gdb backtrace.

====== accented characters ======

**dev@hodei.net 05/05/2014 12:32**

Hi,

I have a problem with accented characters in my web search. To solve it, I modified the database with these queries:

  ALTER TABLE bdict CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
  ALTER TABLE bdicti CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

and I set the variables $localcharset and $browsercharset to utf-8 in my indexer.conf.

But I still have the problem! Do you have any idea?

Thanks

----

My config:

  * Debian 3.2.51-1 x86_64 GNU/Linux
  * mnogosearch 3.3.15
  * indexer.conf: ...... DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob ......

No reply

====== Delete a line in Server method in indexer.conf ======

**dev@hodei.net 05/05/2014 12:26**

Hi,

In the indexer.conf file, under 'Server [Method]', I want to delete an entry like this:

  server http://www.eke.org

Will all pages of that site be removed from the database during the next crawl?

Thanks

----

My config:

  * Debian 3.2.51-1 x86_64 GNU/Linux
  * mnogosearch 3.3.15
  * indexer.conf: ......
  DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob ......

No reply

====== delete server url in indexer.conf ======

**dev@hodei.net 01/04/2014 19:11**

Hi,

In the indexer.conf file, under 'Server [Method]', I want to delete an entry like this:

  server http://www.eke.org

Do I need to clear the MySQL database after the change?

Thanks

No reply

====== Duplicate commands in indexer ======

**dev@hodei.net 03/04/2014 12:49**

**dev@hodei.net 07/08/2014 11:30**

**dev@hodei.net 06/10/2014 11:23**

Hi,

When I execute an indexer command, it runs the command twice. For example:

  /usr/local/mnogosearch/sbin/indexer -Ecreate -d /usr/local/mnogosearch/etc/indexer.conf

creates the tables twice, and on the second run I get a 'table already exist' warning.

  /usr/local/mnogosearch/sbin/indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf

gives this output:

----

  indexer[16663]: Indexing
  indexer[16663]: Loading URL list
  indexer[16663]: Converting intag00
  indexer[16663]: Converting intag01
  indexer[16663]: Converting intag02
  indexer[16663]: Converting intag03
  indexer[16663]: Converting intag04
  indexer[16663]: Converting intag05
  indexer[16663]: Converting intag06
  indexer[16663]: Converting intag07
  indexer[16663]: Converting intag08
  indexer[16663]: Converting intag09
  indexer[16663]: Converting intag0A
  indexer[16663]: Converting intag0B
  indexer[16663]: Converting intag0C
  indexer[16663]: Converting intag0D
  indexer[16663]: Converting intag0E
  indexer[16663]: Converting intag0F
  indexer[16663]: Converting intag10
  indexer[16663]: Converting intag11
  indexer[16663]: Converting intag12
  indexer[16663]: Converting intag13
  indexer[16663]: Converting intag14
  indexer[16663]: Converting intag15
  indexer[16663]: Converting intag16
  indexer[16663]: Converting intag17
  indexer[16663]: Converting intag18
  indexer[16663]: Converting intag19
  indexer[16663]: Converting intag1A
  indexer[16663]: Converting intag1B
  indexer[16663]: Converting intag1C
  indexer[16663]: Converting intag1D
  indexer[16663]: Converting intag1E
  indexer[16663]: Converting intag1F
  indexer[16663]: Total converted: 2604877 records, 13711786 bytes
  indexer[16663]: Converting url data
  indexer[16663]: Switching to new blob table.
  indexer[16663]: Loading URL list
  indexer[16663]: Converting intag00
  indexer[16663]: Converting intag01
  indexer[16663]: Converting intag02
  indexer[16663]: Converting intag03
  indexer[16663]: Converting intag04
  indexer[16663]: Converting intag05
  indexer[16663]: Converting intag06
  indexer[16663]: Converting intag07
  indexer[16663]: Converting intag08
  indexer[16663]: Converting intag09
  indexer[16663]: Converting intag0A
  indexer[16663]: Converting intag0B
  indexer[16663]: Converting intag0C
  indexer[16663]: Converting intag0D
  indexer[16663]: Converting intag0E
  indexer[16663]: Converting intag0F
  indexer[16663]: Converting intag10
  indexer[16663]: Converting intag11
  indexer[16663]: Converting intag12
  indexer[16663]: Converting intag13
  indexer[16663]: Converting intag14
  indexer[16663]: Converting intag15
  indexer[16663]: Converting intag16
  indexer[16663]: Converting intag17
  indexer[16663]: Converting intag18
  indexer[16663]: Converting intag19
  indexer[16663]: Converting intag1A
  indexer[16663]: Converting intag1B
  indexer[16663]: Converting intag1C
  indexer[16663]: Converting intag1D
  indexer[16663]: Converting intag1E
  indexer[16663]: Converting intag1F
  indexer[16663]: Total converted: 2605019 records, 13712168 bytes
  indexer[16663]: Converting url data
  indexer[16663]: Switching to new blob table.
----

I have configured indexer like this:

  usr/local/mnogosearch/lib --includedir=/usr/local/mnogosearch/include --mandir=/usr/local/mnogosearch/man --disable-shared --enable-static --enable-syslog --without-docs --enable-pthreads --disable-dmalloc --enable-parser --disable-mp3 --disable-xml --disable-rss --disable-css --disable-js --with-extra-charsets=all --enable-file --enable-http --enable-ftp --enable-htdb --enable-news --with-mysql --with-zlib

----

Here is my config:

  * Debian 3.2.51-1 x86_64 GNU/Linux
  * mnogosearch 3.3.15
  * contents of indexer.conf: ...... DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob ......

----

Do you have any idea?

Thanks

**bar@udm.net 06/10/2014 11:47**

Hi,

Most likely you have two DBAddr commands in your indexer.conf. If this does not help, please send me your indexer.conf.
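The suggestion above (two DBAddr commands in indexer.conf) can be checked quickly with a small script. The following is a minimal sketch, not a tool shipped with mnoGoSearch. It assumes that lines starting with ''#'' are comments and that command names can be matched case-insensitively; it does not follow any included files. The helper name ''find_dbaddr_lines'' and the sample configuration text are hypothetical.

```python
def find_dbaddr_lines(conf_text):
    """Return (line_number, line) for every non-commented DBAddr command.

    Assumptions: '#' starts a comment line and the command name is
    matched case-insensitively. Included files are not followed.
    """
    hits = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.split()[0].lower() == "dbaddr":
            hits.append((number, line))
    return hits


# Hypothetical configuration text; in practice, read the real file, e.g.
# open("/usr/local/mnogosearch/etc/indexer.conf").read()
sample_conf = """\
DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
# DBAddr mysql://root:password@localhost/old/
Server http://www.example.org/
DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob
"""

for number, line in find_dbaddr_lines(sample_conf):
    print(f"line {number}: {line}")
```

If the script reports more than one line, that would be consistent with every indexer action being executed once per DBAddr, as seen in the duplicated log above.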
====== Problem with indexer -Eblob ======

**13/03/2014 17:51**

Hi,

When I try this command, I get an error message:

  mysql_stmt_execute() failed: Lost connection to MySQL server during query

Here is the output of my command:

----

  root@bot:/usr/local/mnogosearch/sbin# ./indexer -Eblob /usr/local/mnogosearch/etc/indexer.conf
  indexer[22039]: Indexing
  indexer[22039]: Loading URL list
  indexer[22039]: Converting intag00
  indexer[22039]: Converting intag01
  indexer[22039]: Converting intag02
  indexer[22039]: Converting intag03
  indexer[22039]: Converting intag04
  indexer[22039]: Converting intag05
  indexer[22039]: Converting intag06
  indexer[22039]: Converting intag07
  indexer[22039]: Converting intag08
  indexer[22039]: Converting intag09
  indexer[22039]: Converting intag0A
  indexer[22039]: Converting intag0B
  indexer[22039]: Converting intag0C
  indexer[22039]: Converting intag0D
  indexer[22039]: Converting intag0E
  indexer[22039]: Converting intag0F
  indexer[22039]: Converting intag10
  indexer[22039]: Converting intag11
  indexer[22039]: Converting intag12
  indexer[22039]: Converting intag13
  indexer[22039]: Converting intag14
  indexer[22039]: Converting intag15
  indexer[22039]: Converting intag16
  indexer[22039]: Converting intag17
  indexer[22039]: Converting intag18
  indexer[22039]: Converting intag19
  indexer[22039]: Converting intag1A
  indexer[22039]: Converting intag1B
  indexer[22039]: Converting intag1C
  indexer[22039]: Converting intag1D
  indexer[22039]: Converting intag1E
  indexer[22039]: Converting intag1F
  indexer[22039]: Total converted: 32554167 records, 78081117 bytes
  indexer[22039]: Converting url data
  indexer[22039]: mysql_stmt_execute() failed: Lost connection to MySQL server during query

----

Do you have any idea?

----

My config:

  * Debian 3.2.51-1 x86_64 GNU/Linux
  * mnogosearch 3.3.15

**bar@mnogosearch.org 18/03/2014 03:18**

Try this:

  select @@max_allowed_packet;

If the maximum packet size is too small, try increasing it on the server side.

If this does not help, compile mnogosearch again, adding --with-debug to the configure command line, then add the "DebugSQL=yes" parameter to DBAddr, like this:

  DBAddr mysql://root@localhost/test/?DebugSQL=yes

and run "indexer -Eblob 2>LOG.txt". It will print all SQL queries to the log file. Check the last few lines in the log.

**dev@hodei.net 18/03/2014 12:07**

Hi,

Thanks for your help.

When I run select @@max_allowed_packet; I get this result:

  +----------------------+
  | @@max_allowed_packet |
  +----------------------+
  |             16777216 |
  +----------------------+
  1 row in set (0.00 sec)

So I recompiled mnogosearch 3.3.15 with --with-debug added, then ran make and make install. I added DebugSQL=yes in indexer.conf at the end of the DBAddr line:

  DBAddr mysql://root:mypasswd@localhost/test/?dbmode=blob&DebugSQL=yes

and I ran indexer -Eblob 2>LOG.txt

The content of LOG.txt is:

----

  indexer[13250]: Indexing
  indexer[13250]: Loading URL list

----

The process was terminated on the console with a 'Killed' message.

I do not understand anything. Is my database corrupted? Can you help me?

No reply

====== parameter to server command in indexer.conf ======

**dev@hodei.net 05/05/2014 15:02**

Hi,

When I add this URL to my list in indexer.conf:

----

  server http://fr.wikipedia.org/wiki/Zanpantzar

----

the crawler crawls the whole fr.wikipedia.org site and not only the Zanpantzar directory.

Do you have any idea?
Thanks

----

My config:

  * Debian 3.2.51-1 x86_64 GNU/Linux
  * mnogosearch 3.3.15
  * indexer.conf: ...... DBAddr mysql://root:password@localhost/mnogosearch/?dbmode=blob ......

No reply

====== Crawling order ======

**dev@hodei.net 05/08/2014 12:12**

Hi,

I have 1000 websites in my indexer.conf, listed with 'Server' commands. In what order does the crawler go through the list of websites: random, alphabetical, or other?

**bar@mnogosearch.org 05/08/2014 18:36**

The crawler selects targets in a random order. There are some related command line options:

  * -e Visit 'most expired' (oldest) documents first
  * -o Visit documents with less depth (hops value) first
  * -r Do not try to reduce remote servers load by randomising crawler queue order (faster, but less polite)
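To make the difference between these orderings concrete, here is a toy model of a crawl queue. It only illustrates the ideas of "random", "most expired first" and "less depth first"; it is not mnoGoSearch's actual implementation. The field names ''next_index_time'' and ''hops'', the function ''crawl_order'' and the sample documents are all hypothetical, and the -r option is not modelled.

```python
import random


def crawl_order(docs, mode="random", seed=None):
    """Return docs in the order a toy crawler would visit them.

    mode="random"  : shuffled queue, standing in for the default behaviour
    mode="expired" : smallest next_index_time first (the idea behind -e)
    mode="depth"   : smallest hops value first (the idea behind -o)
    """
    if mode == "expired":
        return sorted(docs, key=lambda d: d["next_index_time"])
    if mode == "depth":
        return sorted(docs, key=lambda d: d["hops"])
    shuffled = list(docs)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical queue entries.
queue = [
    {"url": "http://a.example/", "next_index_time": 30, "hops": 0},
    {"url": "http://b.example/page", "next_index_time": 10, "hops": 2},
    {"url": "http://c.example/dir/", "next_index_time": 20, "hops": 1},
]

for mode in ("expired", "depth", "random"):
    print(mode, [d["url"] for d in crawl_order(queue, mode)])
```

In this model the two sorted modes are deterministic, while the random mode spreads visits across sites, which matches the "politeness" motivation mentioned for the default order.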