On Mon, Aug 2, 2010 at 19:26, Max TenEyck Woodbury
<max at mtew.isa-geek.net> wrote:
> On 08/02/2010 01:04 PM, Gert van den Berg wrote:
>> There seem to be several interfaces to retrieve MSDN articles... Some
>> of those interfaces might be more stable / provide a way to retrieve a
>> current link? (Framing the content would probably not be allowed, but
>> retrieving links should be...)
> If it *looks* like copying, it should be avoided.
What I meant is that framing would allow for things like some really
nice side-by-side views of Wine's documentation and MSDN... It is
unlikely to be legal, though... (and should therefore be avoided)
> I think it will be necessary to regenerate articles as the Wine project
> evolves. I regenerated the 'dlls' page after Alexandre's CVS run today.
> There was some new content so I saved the result. I've regenerated the
> individual DLL pages several times as I improved the generating
> scripts. Both activities put a load on me that I am trying to automate.
> In fact, if you look at the original post before it got hijacked into a
> discussion of the project name, it asks about a way to improve that
> automation.
> Since there will be fairly frequent semi-intelligent reviews of pages,
> such queries can probably be incorporated into that process.
> Can you help me with this, please?
I can give an overview of what I think... I might do a bit of
implementation, but finishing something bigger than a short script
when I have other things to do isn't always easy...
SourceForge provides quite decent hosting in the project hosting space
as well... Automating things from there might be a good idea.
To handle MSDN, I can see the following (very) high-level process:
Posted MSDN links get converted to point to a redirect system in the
project space. (The conversion retrieves and saves some additional
data.) The redirect system then redirects the user to the relevant
content on current MSDN (allowing the user to configure some
parameters, such as view type / language).
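A rough sketch of the link conversion step, in Python. The redirector
address, the "id" query parameter and the lookup_id callback are all
made up for illustration; the real lookup would be the query against
MSDN.

```python
import re
from urllib.parse import urlencode

# Hypothetical redirector address in the project space (assumption).
REDIRECTOR = "http://redirector.example.invalid/msdn"

# Matches anything that looks like an MSDN link, up to the next whitespace.
MSDN_LINK = re.compile(r"https?://msdn\.microsoft\.com/\S+")


def rewrite_links(text, lookup_id):
    """Replace each MSDN URL in text with a link to the redirector.

    lookup_id is a caller-supplied function that maps an MSDN URL to a
    stable content identifier, or returns None if it cannot resolve it.
    """
    def repl(match):
        url = match.group(0)
        content_id = lookup_id(url)
        if content_id is None:
            return url  # leave links we cannot resolve untouched
        return REDIRECTOR + "?" + urlencode({"id": content_id})

    return MSDN_LINK.sub(repl, text)
```

Note that punctuation directly after a URL would be swallowed by the
simple pattern above; a real version would need to be more careful.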
What I have managed to figure out about MSDN content from the Web
services documentation thus far:
1. Content is identified by a Content identifier, which can be any of
the following:
a. GUID
b. Short ID (Short (~8 character) unique identifier such as ms224917)
c. Alias ("Friendly string")
d. asset ID (Documented here:
http://msdn.microsoft.com/en-us/magazine/cc163541.aspx Can be used to
retrieve other (non-URL) identifiers)
e. Content URL (The URL to an MSDN page)
2. A Content key uniquely identifies an article and consists of a
content identifier, locale and version
3. The version identifies the various versions of the documented item,
e.g. the .NET version / VC++ version
4. The locale identifies the preferred language
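To make points 1 to 4 a bit more concrete, a content key could be
modelled roughly like this in Python. The class and field names are my
own invention for illustration, not anything taken from the MSDN
interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IdKind(Enum):
    """The five kinds of content identifier listed in point 1."""
    GUID = "guid"
    SHORT_ID = "short_id"
    ALIAS = "alias"
    ASSET_ID = "asset_id"
    CONTENT_URL = "content_url"


@dataclass(frozen=True)
class ContentKey:
    """Identifier + locale + version (point 2); all names hypothetical."""
    identifier: str
    kind: IdKind
    locale: Optional[str] = None   # e.g. "en-us"; None means no preference
    version: Optional[str] = None  # e.g. a .NET or VC++ version string


# The short ID used as an example above, with an explicit locale.
key = ContentKey("ms224917", IdKind.SHORT_ID, locale="en-us")
```

Making the key frozen (immutable) means it can be used directly as a
dictionary key or set member when caching lookups.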
More detailed version:
1. Find the MSDN URL and retrieve the other identifiers (GetContent
without a locale seems to retrieve almost all the metadata ("partial
match" in the documentation))
2. Store the data in a database (content table), with GUID / short ID /
asset ID as primary key (they are always-present unique identifiers).
Other fields include alias and current URL.
3. Other tables: Locales and versions. Mapping tables between content
and possible locales and versions should also exist. The content table
should record when the data was last updated, and the data should be
automatically updated once it reaches a certain age and a user wants
to retrieve the page / select a non-default locale / version.
4. Generate new link pointing to the redirector system. The redirector
system needs the unique content identifier (GUID / shortID / assetID)
and preferably a version. It should be able to take a parameter to
preview the URL (probably a smaller "options link") and allow the user
to choose other locales / versions and generate links to those. This
"content options" page should also allow problems with the
redirector's link to be reported. The reports should be used to update
the information for that link by querying it again from MSDN.
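The storage side of steps 2 and 3 could look something like the
following sketch, using SQLite from Python. The table and column
names, and the choice of the short ID as the primary key, are
assumptions made for the example only.

```python
import sqlite3
import time

# Hypothetical schema: one content table, locale/version tables and
# mapping tables between content and its possible locales/versions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS content (
    short_id     TEXT PRIMARY KEY,
    guid         TEXT,
    asset_id     TEXT,
    alias        TEXT,
    current_url  TEXT,
    last_updated INTEGER NOT NULL   -- Unix time of the last MSDN query
);
CREATE TABLE IF NOT EXISTS locale  (code TEXT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS version (name TEXT PRIMARY KEY);
CREATE TABLE IF NOT EXISTS content_locale (
    short_id TEXT NOT NULL REFERENCES content(short_id),
    code     TEXT NOT NULL REFERENCES locale(code),
    PRIMARY KEY (short_id, code)
);
CREATE TABLE IF NOT EXISTS content_version (
    short_id TEXT NOT NULL REFERENCES content(short_id),
    name     TEXT NOT NULL REFERENCES version(name),
    PRIMARY KEY (short_id, name)
);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def store_content(db, short_id, current_url, alias=None,
                  locales=(), versions=(), now=None):
    """Insert or refresh one content row plus its locale/version mappings.

    For brevity the guid and asset_id columns are left NULL here.
    """
    now = int(time.time()) if now is None else now
    db.execute(
        "INSERT OR REPLACE INTO content "
        "(short_id, alias, current_url, last_updated) VALUES (?, ?, ?, ?)",
        (short_id, alias, current_url, now))
    for code in locales:
        db.execute("INSERT OR IGNORE INTO locale VALUES (?)", (code,))
        db.execute("INSERT OR IGNORE INTO content_locale VALUES (?, ?)",
                   (short_id, code))
    for name in versions:
        db.execute("INSERT OR IGNORE INTO version VALUES (?)", (name,))
        db.execute("INSERT OR IGNORE INTO content_version VALUES (?, ?)",
                   (short_id, name))
    db.commit()
```

Storing the same short ID again simply overwrites the URL and the
timestamp, which is what a re-query triggered by a problem report
would need.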
It might require several trips to MSDN to initially add an item / to
update versions and locales. This should possibly be scheduled once the
data gets really old (somewhere between 180 days and 1 year) and be run
on demand, at about half the scheduled update age, when someone
requests the content from somewhere where they might want to see the
other versions / locales. This should keep the request volumes to MSDN
low...
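The age check could be as simple as the following Python sketch. The
180-day figure is just the lower end of the range mentioned above,
picked for the example; the function and constant names are made up.

```python
DAY = 24 * 60 * 60  # seconds

# Assumption: use the lower end of the "180 days to 1 year" range.
SCHEDULED_MAX_AGE = 180 * DAY
# On-demand refreshes kick in at about half the scheduled age.
ON_DEMAND_MAX_AGE = SCHEDULED_MAX_AGE // 2


def needs_refresh(last_updated, now, on_demand=False):
    """Return True if the stored data is old enough to query MSDN again.

    last_updated and now are Unix timestamps. on_demand is True when a
    user is actually asking for the other versions / locales.
    """
    limit = ON_DEMAND_MAX_AGE if on_demand else SCHEDULED_MAX_AGE
    return now - last_updated >= limit
```

So a 100-day-old entry would be refreshed when a user opens its
options page, but would be left alone by the scheduled run.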
MSDN changes and updates required:
1. Changes to web services interface: Update relevant parts (This
shouldn't be too serious as long as the entire interface is not
redesigned)
2. Link changes: Change how the URLs are generated from the
identifiers in the database (I don't see an easy way to request a URL
from the web services interface)
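To keep point 2 cheap, the URL generation could live in one small
function driven by a template, so that a link change on the MSDN side
only means editing the template. The pattern below is an assumption
on my part (a short ID under a locale path, with an optional version
in parentheses) and would have to be checked against real MSDN links.

```python
# Assumed URL pattern; adjust here if MSDN changes its link scheme.
URL_TEMPLATE = "http://msdn.microsoft.com/{locale}/library/{short_id}{version}.aspx"


def build_url(short_id, locale="en-us", version=None, template=URL_TEMPLATE):
    """Generate a content URL from the identifiers stored in the database."""
    # Assumption: a version, when given, is appended in parentheses.
    suffix = "({})".format(version) if version else ""
    return template.format(locale=locale, short_id=short_id, version=suffix)
```

Because the template is a parameter, stored identifiers never need to
be touched when the link format changes.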
Other possible extensions: The "redirector" can build a more complete
view of what is available, to allow documentation to be found more
easily (the information should only be retrieved on demand and then
saved) and to prevent duplicate trips to MSDN. (This should eventually
provide a nice "tree" of MSDN with reliable links to the content.)
Gert