Ticket #4 (closed defect: fixed)

Opened 7 years ago

Last modified 7 years ago

encoding email address fails (version 0.0.2)

Reported by: deveiant
Owned by: deveiant
Milestone: Previous Releases
Component: MarkdownSyntax
Version: 0.0.2
Severity: major
Keywords: old bugs
Cc:

Description (last modified by deveiant)

hi

Encoding <address@…> does not produce <a href="&#x6D;&#x61;i&#x6C;&#x74;&#x6F ...> as expected.

AutoAnchorEmailRegexp is missing the 'Extended Mode' modifier, and the call to unescape_special_char should be to unescape_special_chars ...

Regards,
.ralf
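For readers unfamiliar with the 'Extended Mode' modifier, here is a minimal sketch of why leaving it off breaks a pattern that is written with spaces for readability. The pattern and the sample address below are simplified assumptions for illustration, not the actual AutoAnchorEmailRegexp from the library.

```ruby
# Hypothetical, simplified email pattern laid out with whitespace for
# readability. This is NOT the library's real AutoAnchorEmailRegexp.
PATTERN_SOURCE = ' < ( [-.\w]+ @ [-a-z0-9]+ \. [a-z]+ ) > '

# Without the extended flag, every space in the source is a literal
# character that must appear in the input.
PLAIN = Regexp.new(PATTERN_SOURCE)

# With Regexp::EXTENDED (the /x modifier), whitespace in the pattern is ignored.
EXTENDED = Regexp.new(PATTERN_SOURCE, Regexp::EXTENDED)

SAMPLE = "<user@example.com>"  # hypothetical address

puts PLAIN.match?(SAMPLE)     # false
puts EXTENDED.match?(SAMPLE)  # true
```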

Change History

comment:1 Changed 7 years ago by deveiant

  • Description modified (diff)

comment:2 Changed 7 years ago by deveiant

I'm currently on vacation without access to the code, but I'll fix it when I get back. Thanks for the report.

comment:3 Changed 7 years ago by deveiant

  • Status changed from new to closed
  • Resolution set to fixed

Date: 2004-04-29 07:17
Sender: ged
Logged In: YES
user_id=158

Confirmed bug with new tests for email encoding. Applied suggested fixes; will be released with 0.0.3.

comment:4 Changed 7 years ago by deveiant

  • Status changed from closed to reopened
  • Resolution fixed deleted

Date: 2004-05-04 13:13
Sender: None
Logged In: NO

ahhh - we missed uppercase URLs in the fix above :(

'fu@…' is still broken

related item: should

   HTMLTagOpenRegexp  = %r{ < [a-z/!$] [^<>]* }mx

have an 'i'-modifier ??

Regards,
.ralf
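The effect of the 'i' modifier on the tag pattern quoted above can be checked with a short sketch. The first constant copies the pattern as quoted in this ticket; the second constant and the sample tags are assumptions added for illustration.

```ruby
# Pattern as quoted in the ticket (case-sensitive).
HTML_TAG_OPEN = %r{ < [a-z/!$] [^<>]* }mx

# Same pattern with the 'i' modifier added (hypothetical constant name).
HTML_TAG_OPEN_CI = %r{ < [a-z/!$] [^<>]* }mxi

LOWER_TAG = "<div class='x'>"  # hypothetical sample input
UPPER_TAG = "<DIV class='x'>"  # hypothetical sample input

puts HTML_TAG_OPEN.match?(LOWER_TAG)     # true
puts HTML_TAG_OPEN.match?(UPPER_TAG)     # false: 'D' is not in [a-z/!$]
puts HTML_TAG_OPEN_CI.match?(UPPER_TAG)  # true
```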

comment:5 Changed 7 years ago by deveiant

  • Status changed from reopened to closed
  • Resolution set to fixed

Date: 2004-05-04 15:04
Sender: ged
Logged In: YES
user_id=158

You're correct, of course. I can't believe I didn't test upper case in the domain. Can you think of any other email-address variants that would be good to test? If you look at the 'Emails' constant in source:trunk/tests/05_Markdown.tests.rb#43, you can add more tests just by dropping email addresses into that Array.

The case-sensitive problem was confirmed by adding 'fu@…' to it, for example.

The fix is to add an 'i'-modifier, as you suggest, but to AutoAnchorEmailRegexp instead of HTMLTagOpenRegexp (though it's probably a good idea to add it to the latter as well, so I've done that too).

Tests and fix in [43].
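A rough sketch of the case-sensitivity problem and the 'i' fix follows. It assumes the email pattern has the shape quoted later in this ticket; the constant names and sample addresses are hypothetical, and this is not the actual change made in [43].

```ruby
# Assumed shape of the email pattern, taken from the regex quoted in this ticket.
EMAIL_CASE_SENSITIVE   = /<([-.\w]+\@[-a-z0-9]+(\.[-a-z0-9]+)*\.[a-z]+)>/
EMAIL_CASE_INSENSITIVE = /<([-.\w]+\@[-a-z0-9]+(\.[-a-z0-9]+)*\.[a-z]+)>/i

# Hypothetical sample addresses, in the spirit of the 'Emails' test array.
SAMPLE_ADDRESSES = ["someone@example.com", "someone@EXAMPLE.COM"]

SAMPLE_ADDRESSES.each do |address|
  wrapped = "<#{address}>"
  puts format("%-22s sensitive=%-5s insensitive=%s",
              address,
              EMAIL_CASE_SENSITIVE.match?(wrapped),
              EMAIL_CASE_INSENSITIVE.match?(wrapped))
end
```

Only the lowercase address matches the case-sensitive pattern; both match once the 'i' modifier is present.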

comment:6 Changed 7 years ago by gruss

Date: 2004-05-12 09:37
Sender: None
Logged In: NO

add an 'i'-modifier to AutoAnchorEmailRegexp ...
good idea to add it to HTMLTagOpenRegexp

exactly what I had in mind :)
'When I struggle to be terse, I end up by being obscure.'

other email-address variants 

I came up with these variants that all passed the test:

	ll@lll.lllll.ll
	Ull@Ulll.Ulllll.ll
	UUUU1@UU1.UU1UUU.UU
	l@ll.ll
	Ull.Ullll@llll.ll
	Ulll-Ull.Ulllll@ll.ll
	1@111.ll

There is one problem ahead: Internationalized Domain Names (IDN). Since March(?), domains may contain umlauts (äöü, for example, as in info@öko.de - [I hope these show up correctly on your screen]), and these break the [a-z] part of the regex

     <([-.\w]+\@[-a-z0-9]+(\.[-a-z0-9]+)*\.[a-z]+)>

92 additional letters seem to be acceptable. Maybe we should simply relax the regex to '...\@[-\w]...'. There is an ASCII Compatible Encoding (ACE) for domains (mapping büro.de to xn--bro-hoa.de, for example), but we cannot expect users to enter domains in ACE format :)

Regards,
.ralf
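To make the IDN concern concrete, here is a hedged sketch comparing the pattern quoted above with one possible relaxation. Note one assumption that differs from the suggestion in the comment: in current Ruby versions \w matches only ASCII word characters, so this sketch swaps in the Unicode properties \p{L} and \p{N} instead of [-\w]. The constant names are hypothetical, and nothing here is the library's actual code.

```ruby
# Pattern as quoted in the comment above, with the 'i' modifier applied.
ASCII_ONLY = /<([-.\w]+\@[-a-z0-9]+(\.[-a-z0-9]+)*\.[a-z]+)>/i

# One possible relaxation (an assumption, not the library's fix): accept any
# Unicode letter or digit in the domain labels.
UNICODE_DOMAIN = /<([-.\w]+\@[-\p{L}\p{N}]+(\.[-\p{L}\p{N}]+)*\.[a-z]+)>/i

IDN_ADDRESS = "<info@\u00F6ko.de>"        # info@öko.de, written with an escape
ACE_ADDRESS = "<info@xn--bro-hoa.de>"     # ACE form of büro.de from the comment

puts ASCII_ONLY.match?(IDN_ADDRESS)      # false: 'ö' is outside [-a-z0-9]
puts UNICODE_DOMAIN.match?(IDN_ADDRESS)  # true
puts ASCII_ONLY.match?(ACE_ADDRESS)      # true: ACE names are plain ASCII
```

The relaxed pattern still requires an ASCII top-level label, which is a deliberate simplification in this sketch.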
