Filtering valid email addresses

menno · 16 May 2021

I was suddenly confronted with German email addresses containing characters such as ä, ö, ß and ü, which are valid these days (although not every mail server can handle them).

There are various plug-ins for validating email addresses, but sometimes that is not practical, and you cannot steer how they work. So today I put together a small configurable filter. It uses two While functions, so it only works in FileMaker 18 and later.

If you define this as a custom function (CF), you can call it with one or more email addresses. The addresses may be separated by a semicolon, a comma or a return, and the result contains only the valid email addresses.
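
As a hedged illustration of such a call: the thread never names the custom function, so FilterValidEmails below is a hypothetical name, and the example addresses are made up. It assumes the parameter is called emailaddresses, as in the function posted further down, and that the character range is the default one that accepts Latin diacritics.

/* Hypothetical call: semicolon, comma and return all used as separators */
FilterValidEmails ( "jan@example.com; günther@beispiel.de, not-an-address¶piet@@example.nl" )

/* Expected result, one address per line:
jan@example.com
günther@beispiel.de
*/

The other two entries are dropped because one contains no @ and the other contains two.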

The sources I used are:
Email address syntax: https://en.wikipedia.org/wiki/Email_address
Character lists: https://sites.psu.edu/symbolcodes/languages

See my next post for the function.

bigbadwolf · 17 May 2021

Thanks… I will gratefully make use of it.

It is actually a bit of a logistical blunder that these characters are suddenly allowed… Of course everyone should keep their software up to date, but to decide 'suddenly', after so many years, that these characters really ought to be possible too…

What is the next bit of silliness? Distinguishing between upper and lower case…?

Marsau · 17 May 2021

Thank you, Menno. This one goes into my 'library'.

menno · 17 May 2021

I have added checks on the required and maximum lengths: the local part may be at most 64 characters, the total length at most 254, and the TLD must be at least 2 characters. It now also checks that the TLD consists of letters only:

Let ( [ 

	addresslist = Substitute ( emailaddresses ; [ ";" ; ¶ ] ; [ "," ; ¶ ] ) ; 
	found = ValueCount ( addresslist ) ; 

	specialchars = ".!#$%&'*+-/=?^_`{|}~0123456789@" ; 
	regularchars = 
		While ( [ 
			i = 64 ; 
			include = "" ; /* empty or Greek or Cyrillic (Russia, Ukraine, Belarus, Serbia) */
			regular = ""   /* No numbers, signs or symbols */
		] ; 
			i < Case ( 
					IsEmpty ( include ) ; 382 ; 	/* European excl. Greek and Cyrillic */
					include = "Greek" ; 969 ; 	/* European + Greek */
					include = "Cyrillic" ; 1119 ;   /* European + Greek + Cyrillic */
					0 )
		; [ 
			i = i + Case ( 
					i = 90 ; 7 ; 	/* Latin capital */
					i = 122 ; 70 ; 	/* Latin lowercase */
					i = 214 ; 2 ; 	/* Diacritics capital */
					i = 246 ; 2 ; 	/* Diacritics lowercase */
					i = 382 ; 531 ; /* Greek capital */
					i = 937 ; 8 ; 	/* Greek lowercase */
					i = 969 ; 55 ; 	/* Russian, Ukrainian, Serbian and Belarusian */
					1 ) ; 
			regular = regular & Char ( i )
		] ; 
			regular 
		) ; 
	allowedchars = specialchars & regularchars ; 

	validaddresses = 
		While ( [ 
			result = "" ; 
			x = 0  
		] ; 
			x < found 
		; [ 
			x = x + 1 ; 
			a = Trim ( GetValue ( addresslist ; x ) ) ; 
			b = Filter ( a ; allowedchars ) ; 
			c = GetValue ( FilterValues ( a ; b ) ; 1 ) ; /* may not be empty */
			d = Right ( c ; Length ( c ) - Position ( c ; "@" ; Length ( c ) ; -1 ) ) ; /* Fqdn */
			e = Left ( c ; Position ( c ; "@" ; 1 ; 1 ) - 1 ) ;                         /* Localpart */
			f = Right ( d ; Length ( d ) - Position ( d ; "." ; Length ( d ) ; -1 ) ) ; /* TLD */

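			g = ( Length ( d ) - Position ( d ; "." ; Length ( d ) ; -1 ) >= 2 ) ; /* TLD has at least 2 characters */
			h = ( PatternCount ( c ; ".." ) = 0 ) ; /* no two consecutive dots */
			i = ( PatternCount ( c ; "@" ) = 1 ) ;  /* exactly one @ */
			j = PatternCount ( d ; "." ) ;  /* domain contains at least one dot */
			k = ( Left ( d ; 1 ) ≠ "." ) ;  /* domain does not start with a dot */
			l = ( Left ( e ; 1 ) ≠ "." ) ;  /* local part does not start with a dot */
			m = ( Right ( d ; 1 ) ≠ "." ) ; /* domain does not end with a dot */
			n = ( Right ( e ; 1 ) ≠ "." ) ; /* local part does not end with a dot */
			o = ( Length ( c ) <= 254 ) ; /* total length at most 254 */
			p = ( Length ( e ) <= 64 ) ; /* local part at most 64 characters */
		 	q = Exact ( f ; Filter ( f ; regularchars ) ) ; /* TLD consists of letters only */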

			result = List ( result ; If ( g and h and i and j and k and l and m and n and o and p and q ; a ) ) 
		] ; 
			result 
		) 

] ; 

	validaddresses  

)

/* 

More info on valid characters

Syntax emailaddresses source: 
https://en.wikipedia.org/wiki/Email_address

Characterlists source: 
https://sites.psu.edu/symbolcodes/languages

65 - 90 Cap. Europe (the loop starts at 64 but increments before adding, so "@" is not included)
97 - 122 Low. Europe 
192 - 214 Cap. Diacritics
216 - 246 Low. Diacritics
248 - 382 Oth. Europe
913 - 937 Cap. Greece
945 - 969 Low. Greece 
1024 - 1071 Cap. Russia, Ukraine, Serbia, Belarus
1072 - 1119 Low. Russia, Ukraine, Serbia, Belarus

*/
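
The length and TLD rules added in this version can be exercised with a few inputs. This is a sketch only: FilterValidEmails is a hypothetical name for the custom function above (the thread does not name it), and the addresses are invented for the test.

List (
	FilterValidEmails ( "a@b.c" ) ;                    /* rejected: TLD shorter than 2 characters */
	FilterValidEmails ( "user@example.c0m" ) ;         /* rejected: TLD contains a digit */
	FilterValidEmails ( "user..name@example.com" ) ;   /* rejected: two consecutive dots */
	FilterValidEmails ( ".user@example.com" ) ;        /* rejected: local part starts with a dot */
	FilterValidEmails ( "user@example.com" )           /* accepted */
)

/* Returns: user@example.com */

List skips empty values, so only the one accepted address remains in the result.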

Marsau · 20 March 2024

It is a nice function. I ran into problems with a mail server that does not accept an "ë" in the address. The CF does not take these out, and it is not clear to me how to prevent that. For now I use another function to strip all the accents.

The function also trips over addresses such as "Voornaam Achternaam <mail@ergens.com>".

I made a small addition that, in this case, splits the address, tests the email part and, if it is OK, restores the full address.

Let ( [ 

	addresslist = Substitute ( MailSendItems::Email ; [ "; " ; ¶ ] ; [ ";" ; ¶ ] ; [ ", " ; ¶ ]; [ "," ; ¶ ] ) ; 
	found = ValueCount ( addresslist ) ; 

	specialchars = ".!#$%&'*+-/=?^_`{|}~0123456789@" ; 
	regularchars = 
		While ( [ 
			i = 64 ; 
			include = "" ; /* empty or Greek or Cyrillic (Russia, Ukraine, Belarus, Serbia) */
			regular = ""   /* No numbers, signs or symbols */
		] ; 
			i < Case ( 
					IsEmpty ( include ) ; 382 ; 	/* European excl. Greek and Cyrillic */
					include = "Greek" ; 969 ; 	/* European + Greek */
					include = "Cyrillic" ; 1119 ;   /* European + Greek + Cyrillic */
					0 )
		; [ 
			i = i + Case ( 
					i = 90 ; 7 ; 	/* Latin capital */
					i = 122 ; 70 ; 	/* Latin lowercase */
					i = 214 ; 2 ; 	/* Diacritics capital */
					i = 246 ; 2 ; 	/* Diacritics lowercase */
					i = 382 ; 531 ; /* Greek capital */
					i = 937 ; 8 ; 	/* Greek lowercase */
					i = 969 ; 55 ; 	/* Russian, Ukrainian, Serbian and Belarusian */
					1 ) ; 
			regular = regular & Char ( i )
		] ; 
			regular 
		) ; 
	allowedchars = specialchars & regularchars ; 

	validaddresses = 
		While ( [ 
			result = "" ; 
			x = 0  
		] ; 
			x < found 
		; [ 
			x = x + 1 ; 
			probe = Trim ( GetValue ( addresslist ; x ) ) ; 

			a = Let ( 
					 [ 
						p = Position ( probe ; " <"; 1; 1 ) ; 
						r = Right ( probe; 1 ) = ">" and p 
					 ] ; 

						If ( r ; GetValue ( Replace ( Left ( probe; Length ( probe ) - 1 ) ; p ; 2 ; ¶ ); 2 ) ; probe ) );

			b = Filter ( a ; allowedchars ) ; 
			c = GetValue ( FilterValues ( a ; b ) ; 1 ) ; /* may not be empty */
			d = Right ( c ; Length ( c ) - Position ( c ; "@" ; Length ( c ) ; -1 ) ) ; /* Fqdn */
			e = Left ( c ; Position ( c ; "@" ; 1 ; 1 ) - 1 ) ;                         /* Localpart */
			f = Right ( d ; Length ( d ) - Position ( d ; "." ; Length ( d ) ; -1 ) ) ; /* TLD */

			g = ( Length ( d ) - Position ( d ; "." ; Length ( d ) ; -1 ) >= 2 ) ; 
			h = ( PatternCount ( c ; ".." ) = 0 ) ; 
			i = ( PatternCount ( c ; "@" ) = 1 ) ;  
			j = PatternCount ( d ; "." ) ;  
			k = ( Left ( d ; 1 ) ≠ "." ) ;  
			l = ( Left ( e ; 1 ) ≠ "." ) ;  
			m = ( Right ( d ; 1 ) ≠ "." ) ; 
			n = ( Right ( e ; 1 ) ≠ "." ) ; 
			o = ( Length ( c ) <= 254 ) ; 
			p = ( Length ( e ) <= 64 ) ; 
		 	q = Exact ( f ; Filter ( f ; regularchars ) ) ; 

			result = List ( result ; Case ( not ( g and h and i and j and k and l and m and n and o and p and q ) ; ""; probe <> a; probe; a ) ) 
		] ; 
			result 
		) 

] ; 

	validaddresses  

)
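
The display-name handling in the function above can be tried in isolation. The sketch below is not the code from the post: it uses Middle instead of the Replace/GetValue combination, and the variable name wrapped is chosen for illustration. It only extracts the part between " <" and the closing ">".

Let ( [
	probe = "Voornaam Achternaam <mail@ergens.com>" ;
	p = Position ( probe ; " <" ; 1 ; 1 ) ;          /* 0 when " <" does not occur */
	wrapped = p > 0 and Right ( probe ; 1 ) = ">"
] ;
	If ( wrapped ;
		Middle ( probe ; p + 2 ; Length ( probe ) - p - 2 ) ;   /* text between " <" and ">" */
		probe )
)

/* Returns: mail@ergens.com */

An address without a space before the "<", such as "Name<mail@ergens.com>", is not recognised by either variant. It is passed on unchanged and then fails validation, because "<" and ">" are not in the allowed characters.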

menno · 21 March 2024

I think that without diacritics it should look like this:

Let ( [ 

	addresslist = Substitute ( MailSendItems::Email ; [ "; " ; ¶ ] ; [ ";" ; ¶ ] ; [ ", " ; ¶ ]; [ "," ; ¶ ] ) ; 
	found = ValueCount ( addresslist ) ; 

	specialchars = ".!#$%&'*+-/=?^_`{|}~0123456789@" ; 
	regularchars = 
		While ( [ 
			i = 64 ; 
			include = "" ; /* empty or Greek or Cyrillic (Russia, Ukraine, Belarus, Serbia) */
			regular = ""   /* No numbers, signs or symbols */
		] ; 
			i < Case ( 
					IsEmpty ( include ) ; 122 ; 	/* European no diacritics */
					include = "Greek" ; 969 ; 	/* European + Greek */
					include = "Cyrillic" ; 1119 ;   /* European + Greek + Cyrillic */
					0 )
		; [ 
			i = i + Case ( 
					i = 90 ; 7 ; 	/* Latin capital */
					i = 122 ; 70 ; 	/* Latin lowercase */
					i = 214 ; 2 ; 	/* Diacritics capital */
					i = 246 ; 2 ; 	/* Diacritics lowercase */
					i = 382 ; 531 ; /* Greek capital */
					i = 937 ; 8 ; 	/* Greek lowercase */
					i = 969 ; 55 ; 	/* Russian, Ukrainian, Serbian and Belarusian */
					1 ) ; 
			regular = regular & Char ( i )
		] ; 
			regular 
		) ; 
	allowedchars = specialchars & regularchars ; 

	validaddresses = 
		While ( [ 
			result = "" ; 
			x = 0  
		] ; 
			x < found 
		; [ 
			x = x + 1 ; 
			probe = Trim ( GetValue ( addresslist ; x ) ) ; 

			a = Let ( 
					 [ 
						p = Position ( probe ; " <"; 1; 1 ) ; 
						r = Right ( probe; 1 ) = ">" and p 
					 ] ; 

						If ( r ; GetValue ( Replace ( Left ( probe; Length ( probe ) - 1 ) ; p ; 2 ; ¶ ); 2 ) ; probe ) );

			b = Filter ( a ; allowedchars ) ; 
			c = GetValue ( FilterValues ( a ; b ) ; 1 ) ; /* may not be empty */
			d = Right ( c ; Length ( c ) - Position ( c ; "@" ; Length ( c ) ; -1 ) ) ; /* Fqdn */
			e = Left ( c ; Position ( c ; "@" ; 1 ; 1 ) - 1 ) ;                         /* Localpart */
			f = Right ( d ; Length ( d ) - Position ( d ; "." ; Length ( d ) ; -1 ) ) ; /* TLD */

			g = ( Length ( d ) - Position ( d ; "." ; Length ( d ) ; -1 ) >= 2 ) ; 
			h = ( PatternCount ( c ; ".." ) = 0 ) ; 
			i = ( PatternCount ( c ; "@" ) = 1 ) ;  
			j = PatternCount ( d ; "." ) ;  
			k = ( Left ( d ; 1 ) ≠ "." ) ;  
			l = ( Left ( e ; 1 ) ≠ "." ) ;  
			m = ( Right ( d ; 1 ) ≠ "." ) ; 
			n = ( Right ( e ; 1 ) ≠ "." ) ; 
			o = ( Length ( c ) <= 254 ) ; 
			p = ( Length ( e ) <= 64 ) ; 
		 	q = Exact ( f ; Filter ( f ; regularchars ) ) ; 

			result = List ( result ; Case ( not ( g and h and i and j and k and l and m and n and o and p and q ) ; ""; probe <> a; probe; a ) ) 
		] ; 
			result 
		) 

] ; 

	validaddresses  

)

You only need to change regularchars.

menno · 21 March 2024

regularchars is the part where you determine which characters are allowed. I have adjusted that part a little: the variable 'include' now selects which part of the character palette you want to allow. Here I have set it to "Diacritic", but if you empty it, you are left with only the 'normal' characters:

		While ( [ 
			i = 64 ; 
			include = "Diacritic" ; /* empty or Diacritic or Greek or Cyrillic (Russia, Ukraine, Belarus, Serbia) */
			regular = ""   /* No numbers, signs or symbols */
		] ; 
			i < Case ( 
					IsEmpty ( include ) ; 122 ; 	/* European no diacritics */
					include = "Diacritic" ; 255 ; /* European including diacritics */
					include = "Greek" ; 969 ; 	/* European + Greek */
					include = "Cyrillic" ; 1119 ;   /* European + Greek + Cyrillic */
					0 )
		; [ 
			i = i + Case ( 
					i = 90 ; 7 ; 	/* Latin capital */
					i = 122 ; 70 ; 	/* Latin lowercase */
					i = 214 ; 2 ; 	/* Diacritics capital */
					i = 246 ; 2 ; 	/* Diacritics lowercase */
					i = 382 ; 531 ; /* Greek capital */
					i = 937 ; 8 ; 	/* Greek lowercase */
					i = 969 ; 55 ; 	/* Russian, Ukrainian, Serbian and Belarusian */
					1 ) ; 
			regular = regular & Char ( i )
		] ; 
			regular 
		)
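
One way to avoid editing the function every time is to move this loop into its own custom function and pass the choice in as a parameter. This is only a sketch under assumptions: EmailRegularChars and its parameter include are hypothetical names that do not appear in the thread. The loop itself is the one posted above, with the local 'include' variable removed.

/* Hypothetical custom function: EmailRegularChars ( include ) */
While ( [
	i = 64 ;
	regular = ""
] ;
	i < Case (
			IsEmpty ( include ) ; 122 ;          /* A-Z and a-z only */
			include = "Diacritic" ; 255 ;        /* plus Latin diacritics */
			include = "Greek" ; 969 ;            /* plus Greek */
			include = "Cyrillic" ; 1119 ;        /* plus Cyrillic */
			0 )
; [
	/* same jumps over the non-letter ranges as in the posted loop */
	i = i + Case ( i = 90 ; 7 ; i = 122 ; 70 ; i = 214 ; 2 ; i = 246 ; 2 ; i = 382 ; 531 ; i = 937 ; 8 ; i = 969 ; 55 ; 1 ) ;
	regular = regular & Char ( i )
] ;
	regular
)

In the main function the assignment would then read regularchars = EmailRegularChars ( "" ) for plain letters, or regularchars = EmailRegularChars ( "Diacritic" ) to also allow accented Latin characters up to code point 255.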

Marsau · 28 March 2024

Thanks, Menno. It works!
But who would have expected otherwise? 😃
