The Automated Reasoning area of CologNet was a sponsor of a joint panel at the Second Workshop on the Role of Automated Deduction in Mathematics (see http://www.dai.ed.ac.uk/homes/simonco/conferences/CADE02/ ) and the CADE-18 Workshop on Problems and Problem Sets (see http://floc02.diku.dk/PaPS/ ). This was held alongside FLOC-02 (see http://floc02.diku.dk/)in Copenhagen on July 31st 2002. The panel is on "Challenge Problems for Automated Deduction". Simon Colton kindly wrote the following summary of the discussion. Panel Members: Alan Bundy, Bill McCune, Peter Andrews, John Harrison Chair: Simon Colton Format: 10 minute opening remarks for each panel member, followed by an hour of general discussion, open to the floor. Initial Statements ------------------ Simon Colton opened the panel session by stating that - in his view - there are two problems halting the uptake of automated reasoning systems in mathematics: (i) the "low-hanging fruit problem", whereby ATP research has found profitable areas in verification, and, even though many people are interested in proving theorems from mainstream maths, they are paid to produce verification proofs, and (ii) the "you call that a theorem" problem, whereby many things that are counted as theorems in automated reasoning (e.g., the associativity of plus) are so trivially true to mathematicians, that the automated proof of these theorems seems very unimpressive. Peter Andrews said that major contributions to mathematics could be made by ATP systems able to help referees of mathematical papers by checking proofs. One prerequisite for making such systems practical is to develop a formal mathematics library on a large scale. As part of this project he suggested formalising the six books of Bourbaki's "Elements of Mathematics". He expected that the input of the material would start off in an interactive fashion, but hoped that it would become increasingly automatic as the system learned to prove things using the material already input. Bill McCune followed with a discussion of the usage of theorem provers as deductive support to creativity. He talked about his collaboration with the mathematician Padmanabhan, in particular an example with cancellative subgroups. Padmanabhan was turning to Otter to prove theorems as part of the discovery process. However, it was noted that, while Padmanabhan supplied the theorems, McCune employed the prover, and Bill laid down the challenge of making powerful provers such as Otter more interactive and easier to use. Alan Bundy argued that proving ever more difficult theorems might not be the best approach to an increased involvement of ATPs in mathematics. He suggested that it would be better to find a place in the work strategy of mathematicians, and his challenge was to increase the uptake of theorem provers by mathematicians. Two possibilities for this are to smuggle in provers via computer algebra systems and in mathematics education. Bundy suggested that a good testbed is Non-standard analysis, which members of his research team are working on. The advantage to this is that it could help teach students the notion of proof, perhaps without even knowing it. John Harrison ended the opening remarks by talking about the challenge of machine checking mathematics. He discussed the Mizar system developed in Poland, noting that the declarative style it employs is very easy to use. Mizar will tell you that the structure of a proof is valid, even if it cannot justify single steps in it, and Mizar relies on its existing library. He contrasted this with ATPs, which are very weak by design, because they rarely re-use previously proved assumptions. Some Discussion Tracks Which Followed ------------------------------------- Colton to McCune: How did Padmanabhan come on board? McCune: He just contacted me. The problem was never with the power of Otter, but with getting Padmanabhan to use it. Andrews: Do you want to change his apathy towards using Otter? Andrei Voronkov: Can you supply software for more general mathematics? Bundy to Voronkov: We should apply systems to what we are already good at. Do you ever use Vampire to prove any of your research results? Voronkov: But Vampire is not for mathematics, it is for combinatoric problems only. Bundy: Is there any hope for Vampire in maths? Voronkov: Possibly.... ----------------- Colton: What about Zeilberger's claim that the only interesting maths cannot be proved? Everyone: general apathy towards that question: surely we shouldn't take that seriously! Colton: OK then. ----------------- Audience member: If you're talking about mathematicians writing their own provers, then this is very costly - they could just go out and prove theorems. Andrews: But science can't predict when particular work is going to find an application. We should pursue advances wherever possible. Bundy: But doesn't most of the work go into the application? There is a huge amount of work on verification. Harrison: And verification really does rely on mathematics - it is very useful to have maths libraries available. ------------------ John Slaney: We shouldn't be too defensive about our achievements, there's nothing wrong with low-hanging fruit. Bundy: Automated verification of proofs is a worthy overall goal, but a very long term goal, and we don't want to go into that prematurely. Slaney: Well, I've verified some papers for publication. Bundy: Then can we offer a service for a particular journal? Harrison: Or maybe it would be better to trawl some maths databases looking for large numbers of theorems which are easy enough to verify. Chris Benzmuller: If we are checking proofs, then shouldn't we be discussing HCI issues, and looking at data on the design of maths GUIs. Bundy to Jeremy Gow: Isn't it a question of getting the question right: have you thought about a big empirical study? Gow: No, for the current project, we're not planning a large empirical study, but we would hopefully perform one in future. --------------------- Voronkov: We should be thinking about ways to enforce software on people. The microsoft way has been tried, and the Latex model (Knuth/Lamport) is another model. Harrison: Well, I was advocating the ideas behind Mizar, not necessarily that we all go out and become big users. Bundy: If I may repeat what I said earlier about education: this could be a good way to enforce the use of ATPs. ---------------------- Volker Sorge to Panel: In your view, what is wrong with ATP, or, better, what is a good ATP system? Andrews: Use heuristics whenever possible. Bundy: A big knowledge base will be important, and the ability to transfer and adapt theorems. Voronkov: The explanation of proofs is important. Can we understand the output from Mizar? There are sometimes inconsistencies. Colton: I use Otter because it's got a stable distribution, it's easily available for Windows, has a manual and a great track record. These are very important things.