The Automated Reasoning area of CologNet was a sponsor of a joint panel at the
Second Workshop on the Role of Automated Deduction in Mathematics
(see http://www.dai.ed.ac.uk/homes/simonco/conferences/CADE02/ )
and the CADE-18 Workshop on Problems and Problem Sets
(see http://floc02.diku.dk/PaPS/ ). This was held alongside FLOC-02
(see http://floc02.diku.dk/)in Copenhagen on July 31st 2002. The panel
is on "Challenge Problems for Automated Deduction". Simon Colton kindly
wrote the following summary of the discussion.

Panel Members: Alan Bundy, Bill McCune, Peter Andrews, John Harrison
Chair: Simon Colton
 
Format: 10 minute opening remarks for each panel member, followed
by an hour of general discussion, open to the floor.
 
Initial Statements
------------------
 
Simon Colton opened the panel session by stating that - in his view -
there are two problems halting the uptake of automated reasoning
systems in mathematics: (i) the "low-hanging fruit problem", whereby
ATP research has found profitable areas in verification, and, even
though many people are interested in proving theorems from mainstream
maths, they are paid to produce verification proofs, and (ii) the "you
call that a theorem" problem, whereby many things that are counted as
theorems in automated reasoning (e.g., the associativity of plus) are
so trivially true to mathematicians, that the automated proof of these
theorems seems very unimpressive.

Peter Andrews said that major contributions to mathematics could be
made by ATP systems able to help referees of mathematical papers by
checking proofs.  One prerequisite for making such systems practical
is to develop a formal mathematics library on a large scale.  As part
of this project he suggested formalising the six books of Bourbaki's
"Elements of Mathematics". He expected that the input of the material
would start off in an interactive fashion, but hoped that it would
become increasingly automatic as the system learned to prove things
using the material already input.

Bill McCune followed with a discussion of the usage of theorem provers
as deductive support to creativity. He talked about his collaboration
with the mathematician Padmanabhan, in particular an example with
cancellative subgroups. Padmanabhan was turning to Otter to prove
theorems as part of the discovery process. However, it was noted that,
while Padmanabhan supplied the theorems, McCune employed the prover,
and Bill laid down the challenge of making powerful provers such as
Otter more interactive and easier to use.
 
Alan Bundy argued that proving ever more difficult theorems might not
be the best approach to an increased involvement of ATPs in
mathematics. He suggested that it would be better to find a place in
the work strategy of mathematicians, and his challenge was to increase the
uptake of theorem provers by mathematicians. Two possibilities for
this are to smuggle in provers via computer algebra systems and in
mathematics education. Bundy suggested that a good testbed is
Non-standard analysis, which members of his research team are working
on. The advantage to this is that it could help teach students the
notion of proof, perhaps without even knowing it.
 
John Harrison ended the opening remarks by talking about the challenge
of machine checking mathematics. He discussed the Mizar system
developed in Poland, noting that the declarative style it employs is
very easy to use. Mizar will tell you that the structure of a proof is
valid, even if it cannot justify single steps in it, and Mizar relies
on its existing library. He contrasted this with ATPs, which are very
weak by design, because they rarely re-use previously proved
assumptions.
 
Some Discussion Tracks Which Followed
-------------------------------------
 
Colton to McCune: How did Padmanabhan come on board?
 
McCune: He just contacted me. The problem was never
with the power of Otter, but with getting Padmanabhan to use it.
 
Andrews: Do you want to change his apathy towards using Otter?
 
Andrei Voronkov: Can you supply software for more general mathematics?
 
Bundy to Voronkov: We should apply systems to what we are
already good at. Do you ever use Vampire to prove any of your
research results?
 
Voronkov: But Vampire is not for mathematics, it is for combinatoric
problems only.
 
Bundy: Is there any hope for Vampire in maths?
 
Voronkov: Possibly....
 
-----------------
 
Colton: What about Zeilberger's claim that the only interesting maths
cannot be proved?
 
Everyone: general apathy towards that question: surely we shouldn't
take that seriously!
 
Colton: OK then.
 
-----------------
 
Audience member: If you're talking about mathematicians writing their
own provers, then this is very costly - they could just go out and
prove theorems.
 
Andrews: But science can't predict when particular work is going to
find an application. We should pursue advances wherever possible.
 
Bundy: But doesn't most of the work go into the application? There
is a huge amount of work on verification.
 
Harrison: And verification really does rely on mathematics - it is
very useful to have maths libraries available.
 
------------------
 
John Slaney: We shouldn't be too defensive about our achievements,
there's nothing wrong with low-hanging fruit.
 
Bundy: Automated verification of proofs is a worthy overall goal,
but a very long term goal, and we don't want to go into that
prematurely.
 
Slaney: Well, I've verified some papers for publication.
 
Bundy: Then can we offer a service for a particular journal?
 
Harrison: Or maybe it would be better to trawl some maths databases
looking for large numbers of theorems which are easy enough to
verify.
 
Chris Benzmuller: If we are checking proofs, then shouldn't we be
discussing HCI issues, and looking at data on the design of maths
GUIs.
 
Bundy to Jeremy Gow: Isn't it a question of getting the question
right: have you thought about a big empirical study?
 
Gow: No, for the current project, we're not planning a large
empirical study, but we would hopefully perform one in future.
 
---------------------
 
Voronkov: We should be thinking about ways to enforce software
on people. The microsoft way has been tried, and the Latex model
(Knuth/Lamport) is another model.
 
Harrison: Well, I was advocating the ideas behind Mizar, not
necessarily that we all go out and become big users.
 
Bundy: If I may repeat what I said earlier about education: this
could be a good way to enforce the use of ATPs.
 
----------------------
 
Volker Sorge to Panel: In your view, what is wrong with ATP, or,
better, what is a good ATP system?
 
Andrews: Use heuristics whenever possible.
 
Bundy: A big knowledge base will be important, and the ability
to transfer and adapt theorems.
 
Voronkov: The explanation of proofs is important. Can we understand
the output from Mizar? There are sometimes inconsistencies.
 
Colton: I use Otter because it's got a stable distribution, it's
easily available for Windows, has a manual and a great track record. These
are very important things.