Rebol Talk Forum  |  Getting Started  |  Newbie Help  |  Topic: Strange behavior with parses
Author Topic: Strange behavior with parses  (Read 502 times)
Grundle
« on: August 26, 2004, 10:35:32 AM »

I am working on a very simple XML data extraction script in the console, and I have been getting some strange results. I open a large XML dictionary file, run parse-xml on each entry, and then use the to-rebol-data script to "beautify" the results. My problem comes when I pass the data to the to-rebol-data script.

Here is the code:

Code:
to-rebol-data: func [block /local out] [
out: copy []
foreach [tag attr body] block [
 append out to-word tag
 foreach item body [
  either block? item [
   append/only out to-rebol-data item
  ][
   if not empty? trim item [append out item]
  ]
 ]
]
out
]

xmlfile: read %Research/xml_files/gcide_a.xml
nDef: find xmlfile "<p><hw>"
tst: parse-xml nDef
result: to-rebol-data tst

At this point the script throws the following error:

Code:
>> do %dictionary_recognizer.r
Script: "A GCIDE dictionary 'parser'" (25-Aug-2004)
** Script Error: foreach expected data argument of type: series
** Where: to-rebol-data
** Near: foreach item body [

What puzzles me is that I have run tests on the data I am inputting, and I get the following output:

Code:
>> block? tst
== true
>> series? tst
== true

I am very confused. The error says foreach needs a series, and yet when I check whether tst is a series it says true. Does anyone have any idea how I can move forward with this?
Thanks
« Last Edit: August 26, 2004, 02:27:41 PM by Grundle »
Gregg
« Reply #1 on: August 26, 2004, 11:03:25 AM »

If you look at the Near: part of the error, it appears that BODY is the problem in the inner FOREACH loop.
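One way to check this is to walk the parse-xml output and print the type of each BODY before looping over it. The following is only a sketch: check-bodies is a hypothetical helper that is not part of the original script, the sample string is a cut-down stand-in for a dictionary entry, and it assumes REBOL 2, where parse-xml is built in.

Code:
```rebol
REBOL [Title: "Inspect the body slot of each parse-xml node (sketch)"]

; Hypothetical helper: prints every tag together with the datatype of its body.
check-bodies: func [block] [
    foreach [tag attr body] block [
        print [tag "->" type? body]
        ; Only descend when the body really is a block of child items.
        if block? body [
            foreach item body [
                if block? item [check-bodies item]
            ]
        ]
    ]
]

; Cut-down sample with one normal element and one empty element.
check-bodies parse-xml {<p><hw>0</hw><br/></p>}
```

Any tag whose body is not reported as a block is one that the inner FOREACH in to-rebol-data cannot iterate over.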
Grundle
« Reply #2 on: August 26, 2004, 11:36:08 AM »

UPDATE:

Here is the answer, boys and girls. The problem comes from the following XML structure. Take a good look at the <br/> tag. For some reason this little gem is breaking the to-rebol-data script, and my guess is that it breaks up the symmetrical structure of the XML statement. Most likely the script treats it as a block, and since <br/> has no separate closing tag and no content, it messes things up. I love searching for hours for such a silly problem... of course it doesn't help that I am still Rebol ignorant (heh).

Code:
<p><hw>0</hw> <pos>adj.</pos> <sn>1.</sn>  <def>indicating the absence of any or all units under consideration; -- representing the number zero as an Arabic numeral</def><br/>
<syn><b>Syn. --</b> zero</syn><br/>
[<source>WordNet 1.5</source> <source>+PJC</source>]</p>

I find it interesting that parse-xml handles this structure fine, but to-rebol-data fails miserably. It leads me to believe that to-rebol-data should be extended to handle cases such as these, although leaving them out might have been intentional (it could be more trouble than it is worth).
:END UPDATE
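A possible extension along those lines is sketched below. It rests on an assumption not confirmed in this thread: that REBOL 2's parse-xml stores none instead of a block as the body of an empty element such as <br/>, which would explain why the inner foreach complains about a non-series argument. The only change from the function above is the block? guard around the inner loop; the sample string is a shortened stand-in for the real entry.

Code:
```rebol
REBOL [Title: "to-rebol-data with a guard for empty elements (sketch)"]

to-rebol-data: func [block /local out] [
    out: copy []
    foreach [tag attr body] block [
        append out to-word tag
        ; Assumption: an empty element such as <br/> has no body block,
        ; so skip the inner loop unless body really is a block.
        if block? body [
            foreach item body [
                either block? item [
                    append/only out to-rebol-data item
                ][
                    if not empty? trim item [append out item]
                ]
            ]
        ]
    ]
    out
]

; Shortened sample containing the troublesome empty tag.
probe to-rebol-data parse-xml {<p><hw>0</hw><br/></p>}
```

With the guard in place, an empty element contributes only its tag word to the output and the recursion continues with the next sibling instead of stopping with a script error.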
« Last Edit: August 26, 2004, 03:43:57 PM by Grundle »
Sunanda
« Reply #3 on: August 29, 2004, 06:11:12 PM »

If you are working in XML, then be kind to yourself and check out some of Gavin McKenzie's work.

Unfortunately, he didn't park it on REBOL.org and his website no longer seems to exist. Someone else may be able to point you in a better direction, but the best I can offer is an appendix in an academic paper:

http://singe.rucus.net/honours/files/paper/thesis.pdf

It's also worth checking out the various mailing list exchanges:

http://www.rebol.org/cgi-bin/cgiwrap/rebol...E24485753CE736}
singe
« Reply #4 on: August 31, 2004, 09:10:08 PM »

That academic paper is mine. If you would like the sources, I still have Gavin's stuff, and it can currently be found at http://singe.rucus.net/utils/rebol/