Ticket #7 (closed defect: fixed)

Opened 3 years ago

Last modified 11 months ago

Error: cannot open regex file de/4.0.regex

Reported by: hpop
Owned by: deveiant
Priority: critical
Milestone:
Version: 1.0.4
Keywords: locale de regex en dictionary
Cc:

Description

I tried to compile LinkParser with link-grammar 4.5.7 (and the svn version) on Mac OS X 10.5.7. The rake command fails with a lot of errors (attached), but the one in the headline seems to be the important one.

Attachments

rake errors.txt (31.4 KB) - added by hpop 3 years ago.
rake errors

Change History

Changed 3 years ago by hpop

rake errors

comment:1 Changed 3 years ago by deveiant

  • Owner set to deveiant
  • Status changed from new to assigned
  • Version set to 1.0.4

Wow, interesting; I'll try to track down what's going on. I don't even know that I've seen the de/4.0.regex file in the link-grammar distribution before.

Thanks for the report!

comment:2 Changed 3 years ago by hpop

Thanks a lot. There is experimental German support in link-grammar and it would be great if I could test it. Maybe it is looking up the German dictionary because I'm in Germany? That would be odd, because my system language is set to English.

comment:3 Changed 3 years ago by hpop

Oh, and if you need more information or help, just send me an email or post here.

comment:4 Changed 3 years ago by deveiant

  • Keywords locale de regex en dictionary added

Okay, I tried upgrading the version of link-grammar that I test against to the latest on the Abiword site (4.5.7), and it still doesn't error for me.

I suspect it is happening because of some locale setting on your system which is still set to 'de'. You can test to see what locale you're set to (on UNIX-ish systems anyway) by running locale. My system, for example, shows:

$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

Link Grammar apparently does try to load a dictionary that's appropriate for your locale; from api.c in the link-grammar source:

Dictionary dictionary_create_default_lang(void)
{
    Dictionary dictionary;
    char * lang;

    lang = get_default_locale();
    if(lang && *lang) {
        dictionary = dictionary_create_lang(lang);
        free(lang);
    } else {
        /* Default to en when locales are broken (e.g. WIN32) */
        dictionary = dictionary_create_lang("en");
    }

    return dictionary;
}

and the Ruby binding calls dictionary_create_default_lang() if you don't specify a language. You can work around the error for your own use by explicitly creating an English dictionary:

dict = LinkParser::Dictionary.new( 'en' )
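
To illustrate how a locale such as de_DE.UTF-8 could end up selecting the 'de' dictionary, here is a minimal Ruby sketch. It is only an illustration: the helper names language_from_locale and effective_locale are hypothetical, and it assumes the language is the part of the locale name before the first underscore, which has not been checked against get_default_locale() in the link-grammar source.

```ruby
# Hypothetical helper: reduce a POSIX locale name to a bare language code.
# Assumption: the language is whatever precedes the territory, codeset,
# or modifier (e.g. "de" in "de_DE.UTF-8").
def language_from_locale( locale, fallback='en' )
  return fallback if locale.nil? || locale.empty?
  lang = locale.split( /[_.@]/ ).first
  # "C" and "POSIX" name no language, so fall back for those as well
  return fallback if lang.nil? || lang.empty? || %w[C POSIX].include?( lang )
  lang
end

# Hypothetical helper: pick the first non-empty locale variable.
# POSIX precedence is LC_ALL, then the LC_<category> variable, then LANG;
# LC_MESSAGES is used as the category here purely for illustration.
def effective_locale( env=ENV )
  %w[LC_ALL LC_MESSAGES LANG].map {|key| env[key] }.find {|val| val && !val.empty? }
end
```

On a system where only LANG is set, to de_DE.UTF-8, language_from_locale( effective_locale ) gives "de" under these assumptions, which would explain why the de/ files are being looked up.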

I'll be patching the specs to explicitly use an English dictionary to prevent this from happening to other people running on systems with non-English locales, and I'll let the Link Grammar maintainers know that the 'de' data is broken.

Thanks again for your report!

comment:5 Changed 3 years ago by hpop

Yes, my locale is set to:

LANG="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_ALL=

I was hoping to use LinkParser with the experimental German dictionary. Is there any chance of making that work?

comment:6 Changed 3 years ago by deveiant

I just verified that it is indeed a locale setting:

$ LANG=de_DE.UTF-8 rake spec
(in /Users/ged/source/ruby/LinkParser)
Entering ext
make
make: Nothing to be done for `all'.
Leaving ext

bugfix for #3: The first linkage for "The cat runs."
Error: cannot open regex file de/4.0.regex
- thinks cat is the subject (FAILED - 1)
Error: cannot open regex file de/4.0.regex
- thinks runs is the verb (FAILED - 2)

LinkParser::Dictionary
Error: cannot open regex file de/4.0.regex
- can be instantiated using all default values (FAILED - 3)
Error: cannot open regex file de/4.0.regex
- can be instantiated with an options hash (FAILED - 4)
[...]

This means that you can run the specs as-is by running them like so:

LANG=en_US.UTF-8 rake spec
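
If you would rather do the same thing from a Ruby script than from the shell, something like the following sketch should work. The helper name run_with_lang is made up for this example; it relies only on Open3 from the standard library, and it clears LC_ALL on the assumption that a non-empty LC_ALL would otherwise take precedence over LANG.

```ruby
require 'open3'

# Hypothetical helper: run a command with LANG overridden for the child
# process only; the parent's environment is left untouched.
# A nil value tells the child to unset that variable, so LC_ALL is cleared
# here to keep it from overriding LANG.
def run_with_lang( lang, *command )
  env = { 'LANG' => lang, 'LC_ALL' => nil }
  output, status = Open3.capture2( env, *command )
  return output, status.success?
end

# Usage (not executed here):
#   output, ok = run_with_lang( 'en_US.UTF-8', 'rake', 'spec' )
```

Other LC_* variables that are set explicitly in your environment are passed through unchanged, so this is only equivalent to the shell one-liner when LANG is what determines the locale.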

In the next release, the locale will be ignored for the current specs, and I'll see if I can figure out how to make the German dictionary load despite the missing file.

You can try to load the German dictionary explicitly like so:

de_dict = LinkParser::Dictionary.new( 'de' )

Since the regex file is missing from the link-grammar source (and it looks like de/4.0.knowledge is missing, too), there's really nothing I can do to make it work, unfortunately.
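
If you want to check your own link-grammar install for the two files mentioned above, a rough sketch follows. The helper name missing_dictionary_files and the constant are made up for this example, data_dir is an assumption (it is wherever your build installed its dictionaries), and the list covers only the two files named in this ticket, not everything a working dictionary may need.

```ruby
# Only the two files named in this ticket; a complete dictionary
# may well require others.
TICKET_DICTIONARY_FILES = %w[4.0.regex 4.0.knowledge]

# Hypothetical helper: return the names from the list above that are
# absent from <data_dir>/<lang>/.
def missing_dictionary_files( data_dir, lang )
  TICKET_DICTIONARY_FILES.reject do |name|
    File.file?( File.join(data_dir, lang, name) )
  end
end

# Usage (the path is a placeholder, not a known install location):
#   missing_dictionary_files( '/path/to/link-grammar/data', 'de' )
```

An empty array means both files were found; it does not by itself mean the German dictionary will load.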

comment:7 Changed 3 years ago by deveiant

  • Status changed from assigned to closed
  • Resolution set to fixed

(In [58]) Fixed problems with the specs run on systems with non-English locales (fixes #7).

comment:9 Changed 11 months ago by Michael Granger <ged@…>

In [7e5a556c93fd714b4f3e0c777caabacdc6fb3d45]:

Fixed problems with the specs run on systems with non-English locales (fixes #7).
