From: Tom Johnson
Date: October 1, 2007 4:23:03 EDT
To:
Subject: Re: netatalk conversion

I have been VERY negligent in getting back to you on this.  I am sorry about that.

I'll state right here that what we have is not a magic bullet "press this button and everything will work" solution.  Well, I wouldn't trust it that much anyway :-).  Also, since one of the scripts is still missing (working on it), you are lacking data about what to do even though you have the tool (a different script) to do the work.  You have here and in the linked files a LOT of information, but it needs parsing and not all of it applies to your circumstances.  I welcome questions and I expect them - as I said, this is a bit of a mess.  I hope you don't just throw up your hands in frustration, but it may indeed be the case that what follows is more effort/complexity than your situation justifies dealing with.  For us, we had files scattered across 60,000 user accounts to deal with, so the brute force approach of copying all the files to a Mac and then back up to the server was simply impossible, so we dealt with the complexity.

We have been partially successful in finding the scripts involved :-).  There are two scripts involved (well, 2 main ones, and I have since found a 3rd that did a smaller task).  One was a perl script that read and manipulated the data in the files - appledouble.pl.  It was written to be generic, so it doesn't know what layout it is starting with and what it is moving to.  Rather, the changes that need to be made are passed to it as parameters and it does the work.  This proved a wise decision for us since we had to do this move twice in 2 years (once from Apple Double files written by an old Mac OS 6 and 7 extension called NFS/Share to those used by Net-A-Talk, and then a year later from Net-A-Talk to Apple's ._{filename} layout).  It does contain comments, though I will have to leave it to you to decide how useful they are.
Many of them refer to stuff that was specific to the NFS/Share to Net-A-Talk move, which isn't what you are doing.

The second script is the one that walked our file systems, found the files, dealt with renaming them, and passed the parameters to the appledouble script that would then manipulate them.  This one the UNIX admin has not found.  He was going to go hunting for it again this coming week (he has been sidetracked by other projects and some vacation days).  You can live without this script - the perl one is the more complicated one.  However, this one is VERY useful for showing you what needs to be done since it has the file name conventions and the parameters for the appledouble script.  An email in the pile_of_appledouble_emails.txt file may contain the parameters we used - "-v2 -r92 -s" (write Apple Double Version 2, flags 9 and 2 - in THAT order -, file name as a C string instead of a Pascal string).  One of the comments in the appledouble.pl script - likely put in around the time we did the 2nd conversion (the Net-A-Talk to Apple SMB one - so what you are doing) - also lists those arguments.  So, even without the missing script you might have the parameters you need.

I have been going through my old emails looking at the notes we wrote about all of this.  Some are from our first run of moving from NFS/Share to Net-A-Talk (which you don't have to worry about).  Others are from the next year when we moved from Net-A-Talk to Apple's Apple Double (unfortunately, there are far fewer emails about that one, since we did more of it by voice - we already had the scripts written from the previous year and were just tweaking).  When you read the comments in the scripts, you'll see a lot of stuff that refers to things we had to do in the NFS/Share to Net-A-Talk move, which you won't need to deal with.  So you will want to keep in mind that this appledouble.pl script can do FAR more than what you will need it to do.
In going through my emails I found a 3rd script which changes file names from MacRoman character encoding to UTF-8.  At the time we found that the text encoding that Net-A-Talk was using for non-ASCII characters was MacRoman.  I know that Net-A-Talk supports 4 different ways of writing file names.  On the OS X side, it was/is using UTF-8.  I don't remember a ton of the details on this point, but I have a few emails about it which are in the pile_of_appledouble_emails.txt file (see below).

You'll want a copy of the AppleDouble spec.  You REALLY should double check stuff and make sure that things haven't changed since we did our work (they may have) and that you understand what our scripts do.

I also compiled a long text file which is just a copy and paste of some of the more important/informative of our internal emails.  Not as good as "real" documentation, but it should give a better idea of what we were doing (and far better than my memory).  It will be sort of annoying to read, but hopefully it proves useful.  There are more emails - I grabbed what appear to be the most important ones.  If you believe you need more of them, or you have a reply message that didn't quote the original and I didn't catch that (I did check for that and there shouldn't be anything like that), let me know.

All the files can be found here:
http://web.ics.purdue.edu/~tjohnson/appledouble_work/

Currently, there is:

appledouble.pl - the perl script that does the nitty gritty of the conversion.
roman2utf8.pl - the perl script that did the MacRoman to UTF8 file name conversions
pile_of_appledouble_emails.txt - a series of the emails we traded while working on this.  Contains useful info and insights but also unneeded info that was specific to us
AppleSingle:AppleDouble.html - a copy of the old Apple Documentation for the Apple Double format.
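
For double checking your own files against the AppleDouble spec mentioned above, a header dump along these lines may help.  This is only a minimal Python sketch, not anything taken from appledouble.pl.  It assumes the layout in the published spec (4-byte magic number 0x00051607, 4-byte version, 16 filler bytes, a 2-byte entry count, then one 12-byte descriptor per entry holding ID, offset and length, all big-endian).  The function name read_entries and the sample bytes are made up for illustration.

```python
import struct

APPLEDOUBLE_MAGIC = 0x00051607  # magic number from the AppleDouble spec


def read_entries(data):
    """Return (version, [(entry_id, offset, length), ...]) in header order."""
    magic, version = struct.unpack(">II", data[:8])
    if magic != APPLEDOUBLE_MAGIC:
        raise ValueError("not an AppleDouble file")
    # Bytes 8..23 are filler in V2 (the home file system name in V1).
    (count,) = struct.unpack(">H", data[24:26])
    entries = []
    for i in range(count):
        start = 26 + 12 * i
        entries.append(struct.unpack(">III", data[start:start + 12]))
    return version, entries


# Hypothetical sample: a V2 header with Finder Info (09) and an empty
# resource fork (02), built here only so the sketch runs on its own.
sample = struct.pack(">II16sH", APPLEDOUBLE_MAGIC, 0x00020000, b"\0" * 16, 2)
sample += struct.pack(">III", 9, 50, 32)
sample += struct.pack(">III", 2, 82, 0)
sample += b"\0" * 32

version, entries = read_entries(sample)
for entry_id, offset, length in entries:
    print("%02x offset=%d length=%d" % (entry_id, offset, length))
```

Reading a real file with open(path, "rb").read() and passing the bytes to read_entries should list the entries in the order they appear in the header.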
Other Apple Double/Apple Single guides that might be useful are:
http://users.phg-online.de/tk/netatalk/doc/Apple/v1/
http://www.nulib.com/library/FTN.e000023.htm

A few major hurdles that we had to deal with that I remember.  This is in the general sense, not the specific solution sense.  These were all dealt with in our scripts.

- Net-A-Talk and Apple both write Apple Double files, but they use different naming conventions for them, and different character encodings for the file names if they contain non-ASCII characters.

Net-A-Talk file name convention:
{filename}
.AppleDouble/{filename}

Apple file name convention:
{filename}
._{filename}

- VERY annoying - they don't read the header to find out where the data is, they ASSUME the data is at the offset they normally write it to, so they go straight to that offset in the file.  This works fine if THEY wrote the file, but if some other party wrote the file and that data is at a different offset (properly indicated in the header though) it breaks because the header isn't read.  It is faster since it saves one read on every single file open - which is likely the reason they both did this - but it makes it harder to move files between different mechanisms like what you are doing.

- Apple only wrote flags 09 (Finder Info) and 02 (Resource fork) in its AppleDouble files.  That would be fine EXCEPT if the header contained any additional flags (and Net-A-Talk keeps a lot more Mac specific data so had more flags) it would puke and die.  So we had to dump any flags beyond those two.  Also the ORDER of the flags matters, so we had to reorder them!

I don't know how many of these are still true - this was back in 2002 and 2003 with Mac OS X 10.2 and Net-A-Talk 1.3 something.  So you may want to map your own Net-A-Talk files and see if things are still the case.

My old handwritten notes about which flags are kept by which implementation of AppleDouble.  ORDER MATTERS!

Net-A-Talk - AppleDouble V1
02 - Resource Fork - 05AE length
03 - Realname
04 - Comment
07 - (V1)
09 - Finder Info - 02 length

Net-A-Talk - Apple Double V2
02 - Resource Fork
03 - Realname
04 - Comment
08 -
09 - Finder info
0b - ProDOS File Info
0d - AFP Short Name - 0 byte length
0e - AFP file info, attribute, etc - 4 byte length
0f - AFP directory ID - 4 byte length

Apple SMB - Apple Double V2
09 - Finder info - 02 length
02 - Resource fork - 05AE length

I also have some Hex printouts of the files with the header mapped.  However, as printouts, they aren't really easy to type up.  If you think it would be useful I can try scanning them.

I hope that makes some sense.  Feel free to ask questions - they are welcome.  I know this is a bit of a mess/dump.  I am sorry about that.  I wish it was in an easier/simpler form.  I'll check in with the UNIX admin about trying to find the other script.

Tom "Macintosh Doctor" Johnson
palantir@purdue.edu


Ok, we found the other perl script.  Actually an archive of scripts but only one of them matters for you - the file system walker and mover and conversion caller.

Same URL as before.  Once more, the .pl is replaced by .txt to keep my web server happy.

netatalk2afp.pl

This is the script that I mentioned as missing in my previous email.  This perl script walks the file system finding candidate files, moving and renaming them, running the MacRoman to UTF8 script on their file names, and running the appledouble.pl script on them to modify the metadata header to make things work.

It is well commented and should prove quite useful to you.  Indeed, with the three perl scripts in your hands, you likely could get a test run (very beta :-) together reasonably quickly - just adjusting any path assumptions we make.  Give it some test directory path to walk and play with.
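
To make the walking and renaming step concrete, here is a minimal Python sketch of a dry run.  It is not the logic of netatalk2afp.pl or roman2utf8.pl, only an illustration of the two naming conventions and the MacRoman to UTF-8 step described above.  The function names to_utf8, plan_move and walk are hypothetical, and it assumes every file name on disk is plain MacRoman and ignores any special cases a real tree may contain.

```python
import os


def to_utf8(name: bytes) -> bytes:
    """Re-encode one MacRoman file name as UTF-8."""
    return name.decode("mac_roman").encode("utf-8")


def plan_move(dirpath: bytes, name: bytes):
    """Map one Net-A-Talk file to Apple-style paths.

    Returns [(old, new), (old, new)] for the data file and its metadata file:
      {dir}/{name}              -> {dir}/{utf8 name}
      {dir}/.AppleDouble/{name} -> {dir}/._{utf8 name}
    """
    new_name = to_utf8(name)
    return [
        (os.path.join(dirpath, name), os.path.join(dirpath, new_name)),
        (os.path.join(dirpath, b".AppleDouble", name),
         os.path.join(dirpath, b"._" + new_name)),
    ]


def walk(root: str):
    """Dry run: yield planned renames for every regular file under root.

    Nothing is moved, and the sketch does not check that the
    .AppleDouble counterpart of a file actually exists.
    """
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        # Do not descend into the metadata directories themselves.
        dirnames[:] = [d for d in dirnames if d != b".AppleDouble"]
        for name in filenames:
            yield from plan_move(dirpath, name)


# 0x8e is "e with acute accent" in MacRoman; the directory is made up.
for old, new in plan_move(b"testdir", b"R\x8esum\x8e"):
    print(old, "->", new)
```

Working on bytes rather than strings keeps the original file names untouched until the one explicit conversion, which seems the safer choice when the on-disk encoding is in question.  Printing the plan from walk() for a test directory before moving anything is a cheap way to spot names that do not convert the way you expect.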

I created a test directory tree with all kinds of different files - some with resource forks, some without, various common type and creator codes, some with file extensions, some without, and all kinds of names including non-ASCII characters.  We tarred and gzipped it up and then used that archive as a source for test directories.

We did all of this on Solaris boxes as well, so you shouldn't need to clean up any of the subtleties that differentiate different flavors of UNIX.

Now I just need to remember to scan the Hexdump (actually created using od -x but who is checking ;-) of the metadata/resource file and its mapping I have and give you those...
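
For experimenting on a test tree, the "keep only flags 09 and 02, in that order" step from the notes above can be sketched in a few lines of Python.  This is an illustration only and not how appledouble.pl does it.  It assumes the header layout from the AppleDouble spec (26 header bytes followed by 12-byte big-endian entry descriptors), always writes version 2, and packs the kept entries directly after the header.  It does NOT enforce any fixed offsets, so whether the result matches the offsets the reader expects has to be checked against real files.  The name keep_entries and the sample data are made up.

```python
import struct

MAGIC = 0x00051607    # AppleDouble magic number from the spec
VERSION_2 = 0x00020000


def keep_entries(data: bytes, keep=(9, 2)) -> bytes:
    """Rebuild an AppleDouble file with only the entries in `keep`, in that order."""
    (magic,) = struct.unpack(">I", data[:4])
    if magic != MAGIC:
        raise ValueError("not an AppleDouble file")
    (count,) = struct.unpack(">H", data[24:26])
    found = {}
    for i in range(count):
        # Read each entry's location from the header instead of assuming it.
        entry_id, offset, length = struct.unpack_from(">III", data, 26 + 12 * i)
        found[entry_id] = data[offset:offset + length]
    # Assumption: an entry missing from the source is written with zero length.
    bodies = [(eid, found.get(eid, b"")) for eid in keep]
    header = struct.pack(">II16sH", MAGIC, VERSION_2, b"\0" * 16, len(bodies))
    offset = 26 + 12 * len(bodies)
    table = b""
    for eid, body in bodies:
        table += struct.pack(">III", eid, offset, len(body))
        offset += len(body)
    return header + table + b"".join(body for _, body in bodies)


# Hypothetical source file with entries 02, 03 and 09, in that order.
src = struct.pack(">II16sH", MAGIC, VERSION_2, b"\0" * 16, 3)
src += struct.pack(">III", 2, 62, 4)
src += struct.pack(">III", 3, 66, 4)
src += struct.pack(">III", 9, 70, 32)
src += b"RSRC" + b"name" + b"F" * 32

out = keep_entries(src)
print(struct.unpack_from(">III", out, 26))   # first descriptor: entry 09
print(struct.unpack_from(">III", out, 38))   # second descriptor: entry 02
```

Comparing the output of something like this with an od -x dump of a file written by the target system is a quick way to see whether the entry order and offsets line up.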