Adam,
The problem has occurred again, so this time I took notes on the way through. In the past, I have tried re-indexing using the admin client and was not successful. However, this time, after a re-index, things seem to be restored properly. That is encouraging. Either way, I am providing the additional information here in case there is a faster way to recover (a re-index takes a long time), or a code fix that could be made.
I believe the cause of the problem is shutting down the server while Lucene is indexing something, which it does a lot.
Version is eXist 1.4.x, running in Tomcat 7.0. I have seen this in Jetty too, but the logs below come from Tomcat.
I am not sure about removing the file either, but the database will not come back up until all the lock files are removed: data/*.lock and data/lucene/write.lock.
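For reference, the manual cleanup I do could be scripted roughly as follows. This is only a sketch under some assumptions, not anything taken from eXist itself: the class name StaleLockCleaner, its two methods, and the --delete flag are made up for illustration; it assumes Java 7 or later and that Tomcat is fully stopped before it runs. It matches both *.lock and *.lck because the log further down shows journal.lck.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of eXist. Run only while the server is stopped.
class StaleLockCleaner {

    // Collects lock files directly under dataDir, plus lucene/write.lock if present.
    static List<Path> findLockFiles(Path dataDir) throws IOException {
        List<Path> locks = new ArrayList<>();
        // Assumption: lock files end in .lock or .lck (the log shows journal.lck).
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dataDir, "*.{lock,lck}")) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    locks.add(candidate);
                }
            }
        }
        Path luceneLock = dataDir.resolve("lucene").resolve("write.lock");
        if (Files.isRegularFile(luceneLock)) {
            locks.add(luceneLock);
        }
        return locks;
    }

    // Deletes every lock file found and returns how many were removed.
    static int removeLockFiles(Path dataDir) throws IOException {
        int removed = 0;
        for (Path lock : findLockFiles(dataDir)) {
            if (Files.deleteIfExists(lock)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java StaleLockCleaner <data-dir> [--delete]");
            return;
        }
        Path dataDir = Paths.get(args[0]);
        boolean delete = args.length > 1 && "--delete".equals(args[1]);
        if (delete) {
            System.out.println("Removed " + removeLockFiles(dataDir) + " lock file(s)");
        } else {
            // Default is a dry run that only lists what would be removed.
            for (Path lock : findLockFiles(dataDir)) {
                System.out.println("Would remove: " + lock);
            }
        }
    }
}
```

Without --delete it only lists the files, which seemed the safer default given that removing write.lock after an unclean shutdown is exactly what leaves the indexes in the broken state described in #4.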
Below, I have included a bunch of logging information that might help!
===============================
1. If the lock files have not been removed, the database instance will not become available.
File lock last access timestamp: Jun 11, 2011 C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\eXist\WEB-INF\data\journal.lck
Found a stale lockfile. Trying to remove it: C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\eXist\WEB-INF\data\journal.lck
AND....
SEVERE: Servlet /eXist threw load() exception
javax.servlet.ServletException: No database instance available
at org.exist.http.servlets.EXistServlet.init(EXistServlet.java:140)
at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1189)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1103)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1010)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4935)
at org.apache.catalina.core.StandardContext$3.call(StandardContext.java:5262)
at org.apache.catalina.core.StandardContext$3.call(StandardContext.java:5257)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
=====================================
2. The lucene/write.lock seems to cause the most trouble:
2011-06-11 11:36:39,890 [Thread-2] ERROR (BrokerPool.java [configure]:238) - Unable to initialialize database instance 'exist': Exception while reading lucene index directory: Lock obtain timed out: ***@C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\eXist\WEB-INF\data\lucene\write.lock
org.exist.util.DatabaseConfigurationException: Exception while reading lucene index directory: Lock obtain timed out: ***@C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\eXist\WEB-INF\data\lucene\write.lock
at org.exist.indexing.lucene.LuceneIndex.open(LuceneIndex.java:89)
at org.exist.indexing.IndexManager.<init>(IndexManager.java:86)
at org.exist.storage.BrokerPool.initialize(BrokerPool.java:771)
at org.exist.storage.BrokerPool.<init>(BrokerPool.java:654)
at org.exist.storage.BrokerPool.configure(BrokerPool.java:216)
at org.exist.storage.BrokerPool.configure(BrokerPool.java:188)
at org.exist.http.servlets.EXistServlet.startup(EXistServlet.java:672)
at org.exist.http.servlets.EXistServlet.init(EXistServlet.java:113)
at org.apache.catalina.core.StandardWrapper.initServlet(StandardWrapper.java:1189)
at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1103)
at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:1010)
at org.apache.catalina.core.StandardContext.loadOnStartup(StandardContext.java:4935)
at org.apache.catalina.core.StandardContext$3.call(StandardContext.java:5262)
at org.apache.catalina.core.StandardContext$3.call(StandardContext.java:5257)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: ***@C:\Program Files\Apache Software Foundation\Tomcat 7.0\webapps\eXist\WEB-INF\data\lucene\write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:85)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1550)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1525)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1290)
at org.exist.indexing.lucene.LuceneIndex.getWriter(LuceneIndex.java:144)
at org.exist.indexing.lucene.LuceneIndex.open(LuceneIndex.java:87)
... 18 more
===================================
3. Removing just the data/*.lock files still leaves the database down, failing with the lucene/write.lock exception (see #2).
===================================
4. Removing the data/lucene/write.lock lets the database come back up, but it comes up in a very unhappy place.
The application loads and queries run; however, ANY query that relies on an index (Lucene or normal <qname>) fails to execute! In fact, from the recent testing, even simple queries using collection()/XPath fail too.
These queries just return nothing, and they do not write any errors to the logs.
===================================
5. In this case, a re-index seems to have solved the issue. The database is back up and working again. I have definitely tried this previously without positive results, and was forced to revert to a restore point.
Thanks for any help,
Lothy