Full disclosure: this is for a class. I am having a problem with my OS. Everything works within the kernel, but when I try to run a program outside the kernel (the shell), I see all sorts of undefined behavior. I defined a custom service routine for interrupt 21, and it works fine in the kernel, but it seems to cause a processor panic when called from the shell. Loops seem to cause undefined behavior in the shell as well. I asked another programmer for help, and although he couldn't figure it out either, he got a processor panic when he tried to run a loop in the shell (I did not). Other interrupts seem to work fine for the most part.

I think this has something to do with the IVT, and more specifically interrupt 21, but I am unsure, because similar problems can still arise when the ISR for interrupt 21 is neither initialized nor called. Even if it does have something to do with the IVT, I do not know how I would go about debugging it (I tried looking through the memory view in the emulator, but I am unsure whether I was looking in the right place), and if it's not the IVT, I have no idea what it is.

I have been stuck on this issue for quite a while and need to move on in the assignment (the professor is unavailable). If anyone can help me figure out how to debug this, or what the problem is, it would be very helpful. Here is the OS: https://nofile.io/f/0ewTS042E9Y/OS.zip

The compiler is bcc (Bruce's C Compiler). All the included build scripts should work on Linux, and there are debug.bat and build.bat for Windows Subsystem for Linux.

Thank you - cgbsu

P.S. For some reason it needs memcpy, even though I don't call it and no C standard library is linked. This seems to be some sort of optimization, so I implemented it myself. If anyone knows how to get bcc to stop doing this, please let me know. I've wondered if it has something to do with the problem.

When the provided disk image loads your "shell" program, it places the stack inside the EBDA. (See here for details.) The BIOS relies on the EBDA having specific contents, and it will misbehave if they're overwritten. Additionally, when the BIOS writes to the EBDA, it may be overwriting your program's stack.

I also saw some self-modifying code that doesn't clear the prefetch queue. It may fail on some CPUs.

I'm not sure if there are any other problems; I can't debug any further with an unreliable stack.
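To make the address arithmetic concrete, here is a minimal sketch in plain C of how a real-mode SS:SP pair maps to a physical address and how it could be checked against the EBDA. The function names are made up for illustration, and the EBDA range is an assumption (the Bochs BIOS default of 0x9FC00 to 0x9FFFF); none of the values are taken from the disk image.

```c
#include <assert.h>

/* Assumed EBDA range (Bochs BIOS default); other machines may differ. */
#define EBDA_START 0x9FC00UL
#define EBDA_END   0x9FFFFUL

/* Real-mode address translation: physical = segment * 16 + offset. */
unsigned long linear_address(unsigned int segment, unsigned int offset)
{
    assert(segment <= 0xFFFFU && offset <= 0xFFFFU);
    return ((unsigned long)segment << 4) + (unsigned long)offset;
}

/* Returns 1 if the physical address lies inside the assumed EBDA range. */
int in_ebda(unsigned long address)
{
    return address >= EBDA_START && address <= EBDA_END;
}

/* Hypothetical helper: returns 1 if a stack starting at SS:SP that may
   grow down by `depth` bytes would overlap the assumed EBDA range. */
int stack_touches_ebda(unsigned int ss, unsigned int sp, unsigned int depth)
{
    unsigned long top, lowest, highest;

    assert(depth >= 1 && depth <= sp);
    top = linear_address(ss, sp);
    lowest = top - depth;   /* lowest byte the stack may reach */
    highest = top - 1;      /* the first push writes just below SS:SP */
    return lowest <= EBDA_END && highest >= EBDA_START;
}
```

For example, a stack at 0x9000:0xFFF0 starts at physical 0x9FFF0, which is inside the assumed EBDA range, while the same offset in segment 0x2000 is well clear of it.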

I am unsure if I am overwriting anything; however, we don't enter protected mode, and the professor said the following:

"The segment should be a multiple of 0x1000 (remember that a segment of 0x1000 means a base memory location of 0x10000). 0x0000 should not be used because it is reserved for interrupt vectors. 0x1000 also should not be used because your kernel lives there and you do not want to overwrite it. Segments above 0xA000 are unavailable because the original IBM-PC was limited to 640k of memory."

I assume nothing aside from the regions he mentioned contains anything that could easily be corrupted. I have experimented with changing the segment, but to little avail.

cgbsu wrote:

I am unsure if I am overwriting anything; however, we don't enter protected mode, and the professor said the following:

I am very certain that you're overwriting the EBDA. Since you're not switching to protected mode, the BIOS interrupt handlers can still access the EBDA. (And on real hardware, the BIOS will use SMM to access the EBDA regardless of CPU mode.)

cgbsu wrote:

"The segment should be a multiple of 0x1000 (remember that a segment of 0x1000 means a base memory location of 0x10000).

That's not a requirement of the hardware, but it makes it easier to keep track of which portions of memory you're using and avoids trouble with ISA DMA.
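For the ISA DMA part, the usual constraint is that a transfer buffer must not cross a 64 KiB physical boundary. Loading at segments that are multiples of 0x1000 keeps a buffer of up to 64 KiB inside one such block. A rough sketch of the check, with a hypothetical helper name:

```c
#include <assert.h>

/* Returns 1 if the buffer [start, start + length - 1] crosses a 64 KiB
   physical boundary, 0 if it stays inside a single 64 KiB block. */
int crosses_64k_boundary(unsigned long start, unsigned long length)
{
    unsigned long last;

    assert(length > 0);
    last = start + length - 1;
    /* Two addresses are in the same 64 KiB block when the bits above
       bit 15 are identical. */
    return (start >> 16) != (last >> 16);
}
```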

cgbsu wrote:

0x0000 should not be used because it is reserved for interrupt vectors.

It also contains the BDA, another structure that you must not overwrite.

cgbsu wrote:

I assume nothing aside from the regions he mentioned has anything that could easily be corrupted.

Your assumption is incorrect. Your simulator is using the Bochs BIOS, which places the EBDA at address 0x9FC00 by default. (The location may change depending on how it's configured.)
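On most PC BIOSes the EBDA segment is stored as a 16-bit word at physical address 0x40E in the BDA, so the location can be looked up instead of assumed. The sketch below reads it from a byte array standing in for low memory; on the real target the word would have to be read through a far pointer or a small assembly helper instead. The function names are hypothetical.

```c
#include <assert.h>

/* Offset of the EBDA segment word in low memory (BDA field at 0040:000E).
   This holds on most PC BIOSes, but treat it as an assumption. */
#define BDA_EBDA_SEGMENT 0x040EUL

/* Reads a little-endian 16-bit word from a byte image of memory. */
unsigned int read_word_le(const unsigned char *memory, unsigned long address)
{
    return (unsigned int)memory[address]
         | ((unsigned int)memory[address + 1] << 8);
}

/* Returns the physical base address of the EBDA, given an image of at
   least the first 0x410 bytes of memory. */
unsigned long ebda_base(const unsigned char *low_memory)
{
    unsigned int segment;

    assert(low_memory != 0);
    segment = read_word_le(low_memory, BDA_EBDA_SEGMENT);
    return (unsigned long)segment << 4;   /* segment * 16 */
}
```

With the word set to 0x9FC0, this yields 0x9FC00, matching the Bochs default mentioned above.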

cgbsu wrote:

I have experimented with changing the segment, but to little avail.

That means there are additional problems, so I've decided to take another look. Your shell program returns from main! How can it return with no return address on the stack?

I don't mean to sound like I'm opposing what you're saying; I'm just trying to figure out how to solve this problem. I'm somewhat confused as to why everyone else in the class used this region of memory (which the professor told us to use, so I'm assuming his version uses it as well) without problems, while I am having them. I think the simplest solution may be to find a way to go to protected mode, though I'm not sure if that is what you're proposing.

If it's not, I just ran a test:

If I'm not misunderstanding, the EBDA is an area of memory that contains data structures in certain parts of it. According to what you said, memory should be free from the end of the kernel to 0x9FC00. The wiki page you linked says there is a guaranteed region of free memory at 0x7E00. I tried loading the program there and still found issues.

Also, I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?). I put no return statements in the shell program.

cgbsu wrote:

I don't mean to sound like I'm opposing what you're saying; I'm just trying to figure out how to solve this problem. I'm somewhat confused as to why everyone else in the class used this region of memory (which the professor told us to use, so I'm assuming his version uses it as well) without problems, while I am having them.

I might not have made myself clear. Placing the stack in the EBDA is just one of the problems I found, but I don't know if it's the reason why your program doesn't work.

cgbsu wrote:

I think the simplest solution may be to find a way to go to protected mode, though I'm not sure if that is what you're proposing.

I'm not. Switching to protected mode is not simple either; I think you can find an easier solution.

cgbsu wrote:

Also, I'm trying to figure out how the shell returned (could it be one of the interrupts putting something into AL?). I put no return statements in the shell program.

In C, the return statement is optional for functions that return void. A return statement is implied at the end of the function.

I understand now. I think it most likely is not the reason, at least not entirely, given that entering protected mode is as difficult as you said, the EBDA is 0x0 to 0x000FFFFF according to the wiki page and other sources, and 0xFFFF is the maximum sector addressable by a 16-bit value passed to int 13. Without going into protected mode, there isn't a way to avoid writing within the EBDA (if you're going to write anything).

As for the return, bcc is being used so main has no return type.

Code:

main() { /*Code goes here.*/}

I thought this might be a semantic difference put there on purpose to imply that main does not return. If it does return, I guess I would need somewhere to put that data?

cgbsu wrote:

I understand now. I think it most likely is not the reason, at least not entirely, given that entering protected mode is as difficult as you said, the EBDA is 0x0 to 0x000FFFFF according to the wiki page and other sources, and 0xFFFF is the maximum sector addressable by a 16-bit value passed to int 13. Without going into protected mode, there isn't a way to avoid writing within the EBDA (if you're going to write anything).

I'm not sure what you're talking about. The EBDA is 0x9FC00 to 0x9FFFF in your simulator, with similar addresses on other computers. Most of the rest of memory, from 0x600 to 0x9FBFF, is free for your OS and programs. Sector addresses are irrelevant here since these are memory addresses.
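As a rough illustration of that layout, the sketch below classifies a physical address and checks whether a program image loaded at a given segment stays inside the free range. The boundaries are the ones quoted above (0x600 to 0x9FBFF free, EBDA at 0x9FC00 to 0x9FFFF in this simulator), and the names are hypothetical. It does not know where the kernel lives, so that would still need a separate check.

```c
#include <assert.h>

enum region {
    REGION_LOW_RESERVED,  /* below 0x600: IVT, BDA and adjacent low area */
    REGION_FREE,          /* 0x600 to 0x9FBFF                            */
    REGION_EBDA,          /* 0x9FC00 to 0x9FFFF (simulator default)      */
    REGION_UPPER          /* 0xA0000 and up: video memory and ROM        */
};

/* Maps a physical address to one of the regions above. */
enum region classify(unsigned long address)
{
    if (address < 0x600UL)   return REGION_LOW_RESERVED;
    if (address < 0x9FC00UL) return REGION_FREE;
    if (address < 0xA0000UL) return REGION_EBDA;
    return REGION_UPPER;
}

/* Returns 1 if an image of `size` bytes loaded at segment:0 lies entirely
   in the free range. The free range is contiguous, so checking the first
   and last byte is enough. */
int load_is_safe(unsigned int segment, unsigned long size)
{
    unsigned long first, last;

    assert(size > 0);
    first = (unsigned long)segment << 4;
    last = first + size - 1;
    return classify(first) == REGION_FREE && classify(last) == REGION_FREE;
}
```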

cgbsu wrote:

As for the return, bcc is being used so main has no return type.

Code:

main() { /*Code goes here.*/}

I thought this might be a semantic difference put there on purpose to imply that main does not return. If it does return, I guess I would need somewhere to put that data?

In a more complete OS, the C library would provide a wrapper function that calls main, so main has something to return to. The wrapper function doesn't return. Instead, it uses a system call to tell the kernel to end the program after main returns.

Since you don't have a wrapper function like that, you can't let main return.
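A sketch of what such a wrapper could look like, written so it can be tried on an ordinary hosted compiler. Everything here is hypothetical: `sys_terminate` stands in for a "terminate program" system call that this OS does not have yet, and `setjmp`/`longjmp` only simulate the kernel regaining control.

```c
#include <setjmp.h>

static jmp_buf kernel_context;  /* where the simulated kernel resumes */
static int main_calls = 0;      /* counts how often program_main ran  */

/* Hypothetical stand-in for a "terminate program" system call.
   It never returns to its caller. */
static void sys_terminate(void)
{
    longjmp(kernel_context, 1);
}

/* The program's main. Falling off the end is an implicit return. */
static void program_main(void)
{
    main_calls++;
}

/* Entry point the loader would jump to instead of main, so that main
   has something to return to. */
static void program_entry(void)
{
    program_main();
    sys_terminate();
    for (;;) { }  /* not reached; guards against sys_terminate returning */
}

/* Simulated kernel side: returns 1 once the program has terminated
   through sys_terminate. */
int run_program(void)
{
    if (setjmp(kernel_context) != 0)
        return 1;
    program_entry();
    return 0;  /* not reached */
}
```

The point of the design is that `program_entry`, not `main`, is the last frame on the program's stack, and it hands control back through a system call instead of a return instruction.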

I was viewing it incorrectly. I thought 0x0 to 0xFFFFF was the EBDA and 0x04 - 0x0497 was the BDA, with 0x0 to 0xA0000 being the part with the most stuff crammed into it (basically, I was thinking of the EBDA as the larger category encompassing these things). My bad.

There are while( 1 ); loops at the end of both the kernel's and the shell's main, which I have commented out and uncommented many times. The gunk in both main procedures is test code, which I have also commented out and uncommented a bunch. Both while( 1 ) loops (which are there for similar reasons, though the professor told us it had to do with interpreting the next piece of memory as an instruction) have been uncommented simultaneously, usually just giving different undefined behavior.

cgbsu wrote:

There are while( 1 ); loops at the end of both the kernel's and the shell's main, which I have commented out and uncommented many times. The gunk in both main procedures is test code, which I have also commented out and uncommented a bunch. Both while( 1 ) loops (which are there for similar reasons, though the professor told us it had to do with interpreting the next piece of memory as an instruction) have been uncommented simultaneously, usually just giving different undefined behavior.

Those infinite loops prevent the main function from returning with nothing to return to. You should leave them uncommented.

Aside from that and the EBDA thing, I didn't see any other problems. I'd like to see a disk image rebuilt to fix those two issues, but I'm away from my development system for the rest of the week so I wouldn't be able to debug it until then.
