| Doc. no. | N2702=08-0212 |
| Date: | 2008-07-27 |
| Project: | Programming Language C++ |
| Reply to: | Howard Hinnant <howard.hinnant@gmail.com> |
Reference ISO/IEC IS 14882:1998(E)
The purpose of this document is to record the status of issues which have come before the Library Working Group (LWG) of the ANSI (J16) and ISO (WG21) C++ Standards Committee. Issues represent potential defects in the ISO/IEC IS 14882:1998(E) document. Issues are not to be used to request new features.
This document contains only library issues which are actively being considered by the Library Working Group. That is, issues which have a status of New, Open, Ready, and Review. See Library Defect Reports List for issues considered defects and Library Closed Issues List for issues considered closed.
The issues in these lists are not necessarily formal ISO Defect Reports (DR's). While some issues will eventually be elevated to official Defect Report status, other issues will be disposed of in other ways. See Issue Status.
This document is in an experimental format designed for both viewing via a world-wide web browser and hard-copy printing. It is available as an HTML file for browsing or PDF file for printing.
Prior to Revision 14, library issues lists existed in two slightly different versions; a Committee Version and a Public Version. Beginning with Revision 14 the two versions were combined into a single version.
This document includes [bracketed italicized notes] as a reminder to the LWG of current progress on issues. Such notes are strictly unofficial and should be read with caution as they may be incomplete or incorrect. Be aware that LWG support for a particular resolution can quickly change if new viewpoints or killer examples are presented in subsequent discussions.
For the most current official version of this document see http://www.open-std.org/jtc1/sc22/wg21/. Requests for further information about this document should include the document number above, reference ISO/IEC 14882:1998(E), and be submitted to Information Technology Industry Council (ITI), 1250 Eye Street NW, Washington, DC 20005.
Public information as to how to obtain a copy of the C++ Standard, join the standards committee, submit an issue, or comment on an issue can be found in the comp.std.c++ FAQ. Public discussion of C++ Standard related issues occurs on news:comp.std.c++.
For committee members, files available on the committee's private web site include the HTML version of the Standard itself. HTML hyperlinks from this issues list to those files will only work for committee members who have downloaded them into the same disk directory as the issues list files.
New - The issue has not yet been reviewed by the LWG. Any Proposed Resolution is purely a suggestion from the issue submitter, and should not be construed as the view of LWG.
Open - The LWG has discussed the issue but is not yet ready to move the issue forward. There are several possible reasons for open status:
A Proposed Resolution for an open issue is still not to be construed as the view of LWG. Comments on the current state of discussions are often given at the end of open issues in an italic font. Such comments are for information only and should not be given undue importance.
Dup - The LWG has reached consensus that the issue is a duplicate of another issue, and will not be further dealt with. A Rationale identifies the duplicated issue's issue number.
NAD - The LWG has reached consensus that the issue is not a defect in the Standard.
NAD Editorial - The LWG has reached consensus that the issue can either be handled editorially, or is handled by a paper (usually linked to in the rationale).
NAD Future - In addition to the regular status, the LWG believes that this issue should be revisited at the next revision of the standard.
Review - Exact wording of a Proposed Resolution is now available for review on an issue for which the LWG previously reached informal consensus.
Tentatively Ready - The issue has been reviewed online, but not in a meeting, and some support has been formed for the proposed resolution. Tentatively Ready issues may be moved to Ready and forwarded to full committee within the same meeting. Unlike Ready issues they will be reviewed in subcommittee prior to forwarding to full committee.
Ready - The LWG has reached consensus that the issue is a defect in the Standard, the Proposed Resolution is correct, and the issue is ready to forward to the full committee for further action as a Defect Report (DR).
DR - (Defect Report) - The full J16 committee has voted to forward the issue to the Project Editor to be processed as a Potential Defect Report. The Project Editor reviews the issue, and then forwards it to the WG21 Convenor, who returns it to the full committee for final disposition. This issues list accords the status of DR to all these Defect Reports regardless of where they are in that process.
TC - (Technical Corrigenda) - The full WG21 committee has voted to accept the Defect Report's Proposed Resolution as a Technical Corrigenda. Action on this issue is thus complete and no further action is possible under ISO rules.
TRDec - (Decimal TR defect) - The LWG has voted to accept the Defect Report's Proposed Resolution into the Decimal TR. Action on this issue is thus complete and no further action is expected.
WP - (Working Paper) - The proposed resolution has not been accepted as a Technical Corrigendum, but the full WG21 committee has voted to apply the Defect Report's Proposed Resolution to the working paper.
Pending - This is a status qualifier. When prepended to a status this indicates the issue has been processed by the committee, and a decision has been made to move the issue to the associated unqualified status. However for logistical reasons the indicated outcome of the issue has not yet appeared in the latest working paper.
Issues are always given the status of New when they first appear on the issues list. They may progress to Open or Review while the LWG is actively working on them. When the LWG has reached consensus on the disposition of an issue, the status will then change to Dup, NAD, or Ready as appropriate. Once the full J16 committee votes to forward Ready issues to the Project Editor, they are given the status of Defect Report (DR). These in turn may become the basis for Technical Corrigenda (TC), or are closed without action other than a Record of Response (RR). The intent of this LWG process is that only issues which are truly defects in the Standard move to the formal ISO DR status.
Section: 22.2.2.1.2 [facet.num.get.virtuals] Status: Review Submitter: Nathan Myers Date: 1998-08-06
Discussion:
The current description of numeric input does not account for the possibility of overflow. This is an implicit result of changing the description to rely on the definition of scanf() (which fails to report overflow), and conflicts with the documented behavior of traditional and current implementations.
Users expect, when reading a character sequence that results in a value unrepresentable in the specified type, to have an error reported. The standard as written does not permit this.
Further comments from Dietmar:
I don't feel comfortable with the proposed resolution to issue 23: It kind of simplifies the issue too much. Here is what is going on:
Currently, the behavior of numeric overflow is rather counter intuitive and hard to trace, so I will describe it briefly:
Further discussion from Redmond:
The basic problem is that we've defined our behavior, including our error-reporting behavior, in terms of C90. However, C90's method of reporting overflow in scanf is not technically an "input error". The strto_* functions are more precise.
There was general consensus that failbit should be set upon overflow. We considered three options based on this:
Straw poll: (1) 5; (2) 0; (3) 8.
Discussed at Lillehammer. General outline of what we want the solution to look like: we want to say that overflow is an error, and provide a way to distinguish overflow from other kinds of errors. Choose candidate field the same way scanf does, but don't describe the rest of the process in terms of format. If a finite input field is too large (positive or negative) to be represented as a finite value, then set failbit and assign the nearest representable value. Bill will provide wording.
Discussed at Toronto: N2327 is in alignment with the direction we wanted to go with in Lillehammer. Bill to work on.
Proposed resolution:
Change 22.2.2.1.2 [facet.num.get.virtuals], end of p3:
Stage 3:
[Struck: "The result of stage 2 processing can be one of"] The sequence of chars accumulated in stage 2 (the field) is converted to a numeric value by the rules of one of the functions declared in the header <cstdlib>:
- [Struck: "A sequence of chars has been accumulated in stage 2 that is converted (according to the rules of scanf) to a value of the type of val. This value is stored in val and ios_base::goodbit is stored in err."] For a signed integer value, the function strtoll.
- [Struck: "The sequence of chars accumulated in stage 2 would have caused scanf to report an input failure. ios_base::failbit is assigned to err."] For an unsigned integer value, the function strtoull.
- For a floating-point value, the function strtold.
The numeric value to be stored can be one of:
- zero, if the conversion function fails to convert the entire field. ios_base::failbit is assigned to err.
- the most positive representable value, if the field represents a value too large positive to be represented in val. ios_base::failbit is assigned to err.
- the most negative representable value (zero for unsigned integer), if the field represents a value too large negative to be represented in val. ios_base::failbit is assigned to err.
- the converted value, otherwise.
The resultant numeric value is stored in val.
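The Stage 3 rules above can be sketched in ordinary code. The following is only an illustration under stated assumptions: stage3_result and convert_field are hypothetical names, the target type is fixed to int, and base 10 is assumed rather than being derived from the stream's format flags. It is not proposed wording and not any library's implementation.

```cpp
#include <cerrno>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical result type: the value that would be stored in val, plus
// whether ios_base::failbit would be assigned to err.
struct stage3_result {
    int value;
    bool fail;
};

// Hypothetical helper modelling the proposed Stage 3 for a signed int target.
stage3_result convert_field(const std::string& field) {
    errno = 0;
    char* end = nullptr;
    const long long v = std::strtoll(field.c_str(), &end, 10);

    // The conversion function did not consume the entire field: zero, failbit.
    if (field.empty() || end != field.c_str() + field.size())
        return {0, true};

    const int lo = std::numeric_limits<int>::min();
    const int hi = std::numeric_limits<int>::max();

    // Too large positive or too large negative: nearest representable value,
    // failbit. On ERANGE strtoll saturates, so the sign of v is still usable.
    if (errno == ERANGE || v > hi || v < lo)
        return {v < 0 ? lo : hi, true};

    // Otherwise the converted value is stored and no error is reported.
    return {static_cast<int>(v), false};
}
```

With this sketch, a field such as "12x" yields zero with the failure flag set, while an over-long digit string yields the most positive int with the failure flag set, which is the distinction the resolution is trying to make observable.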
Change 22.2.2.1.2 [facet.num.get.virtuals], p6-p7:
iter_type do_get(iter_type in, iter_type end, ios_base& str, ios_base::iostate& err, bool& val) const;
-6- Effects: If (str.flags()&ios_base::boolalpha)==0 then input proceeds as it would for a long except that if a value is being stored into val, the value is determined according to the following: If the value to be stored is 0 then false is stored. If the value is 1 then true is stored. Otherwise [Struck: "err|=ios_base::failbit is performed and no value"] true is stored [Struck: "."] and ios_base::failbit is assigned to err.
-7- Otherwise target sequences are determined "as if" by calling the members falsename() and truename() of the facet obtained by use_facet<numpunct<charT> >(str.getloc()). Successive characters in the range [in,end) (see 23.1.1) are obtained and matched against corresponding positions in the target sequences only as necessary to identify a unique match. The input iterator in is compared to end only when necessary to obtain a character. If [Struck: "and only if"] a target sequence is uniquely matched, val is set to the corresponding value. Otherwise false is stored and ios_base::failbit is assigned to err.
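The non-boolalpha rule in paragraph 6 reduces to a three-way mapping from the long that was read. A minimal sketch, where bool_result and store_bool are hypothetical names used only for illustration:

```cpp
// Hypothetical result type: the bool that would be stored in val, plus
// whether ios_base::failbit would be assigned to err.
struct bool_result {
    bool value;
    bool fail;
};

// Hypothetical helper: maps the numeric value read "as for a long" to the
// bool stored under the proposed wording of paragraph 6.
bool_result store_bool(long n) {
    if (n == 0) return {false, false};  // 0 stores false, no error
    if (n == 1) return {true, false};   // 1 stores true, no error
    return {true, true};                // anything else stores true and fails
}
```

The design point is that a value is always stored, so val is never left untouched when the conversion fails.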
Section: 23.2.6 [vector] Status: Open Submitter: AFNOR Date: 1998-10-07
Discussion:
vector<bool> is not a container as its reference and pointer types are not references and pointers.
Also it forces everyone to have a space optimization instead of a speed one.
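The first complaint can be checked at compile time. The sketch below is illustrative only; has_true_reference is a hypothetical helper, not a standard trait.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical helper: true when C::reference is literally C::value_type&,
// as the container requirements expect.
template <class C>
constexpr bool has_true_reference() {
    return std::is_same<typename C::reference,
                        typename C::value_type&>::value;
}

static_assert(has_true_reference<std::vector<int> >(),
              "vector<int>::reference is int&");
static_assert(!has_true_reference<std::vector<bool> >(),
              "vector<bool>::reference is a proxy class, not bool&");

// A practical consequence: for std::vector<bool> v, the statement
//     bool* p = &v[0];
// does not compile, although the same pattern works for std::vector<int>.
```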
See also: 99-0008 == N1185 Vector<bool> is Nonconforming, Forces Optimization Choice.
[In Santa Cruz the LWG felt that this was Not A Defect.]
[In Dublin many present felt that failure to meet Container requirements was a defect. There was disagreement as to whether or not the optimization requirements constituted a defect.]
[The LWG looked at the following resolutions in some detail:
* Not A Defect.
* Add a note explaining that vector<bool> does not meet
Container requirements.
* Remove vector<bool>.
* Add a new category of container requirements which
vector<bool> would meet.
* Rename vector<bool>.
No alternative had strong, wide-spread, support and every alternative
had at least one "over my dead body" response.
There was also mention of a transition scheme something like (1) add
vector_bool and deprecate vector<bool> in the next standard. (2)
Remove vector<bool> in the following standard.]
[Modifying container requirements to permit returning proxies (thus allowing container requirements conforming vector<bool>) was also discussed.]
[It was also noted that there is a partial but ugly workaround in that vector<bool> may be further specialized with a custom allocator.]
[Kona: Herb Sutter presented his paper J16/99-0035==WG21/N1211, vector<bool>: More Problems, Better Solutions. Much discussion of a two step approach: a) deprecate, b) provide replacement under a new name. LWG straw vote on that: 1-favor, 11-could live with, 2-over my dead body. This resolution was mentioned in the LWG report to the full committee, where several additional committee members indicated over-my-dead-body positions.]
Discussed at Lillehammer. General agreement that we should deprecate vector<bool> and introduce this functionality under a different name, e.g. bit_vector. This might make it possible to remove the vector<bool> specialization in the standard that comes after C++0x. There was also a suggestion that in C++0x we could additionally say that it's implementation defined whether vector<bool> refers to the specialization or to the primary template, but there wasn't general agreement that this was a good idea.
We need a paper for the new bit_vector class.
Proposed resolution:
[ Batavia: The LWG feels we need something closer to SGI's bitvector to ease migration from vector<bool>, although some of the functionality from N2050 could well be used in such a template. The concern is easing the API migration for those users who want to continue using a bit-packed container. Alan and Beman to work. ]
Section: 27.7 [string.streams], 27.8 [file.streams] Status: Open Submitter: Angelika Langer Date: 1999-02-22
Discussion:
The following question came from Thorsten Herlemann:
You can set a mode when constructing or opening a file-stream or filebuf, e.g. ios::in, ios::out, ios::binary, ... But how can I get that mode later on, e.g. in my own operator << or operator >> or when I want to check whether a file-stream or file-buffer object passed as parameter is opened for input or output or binary? Is there no possibility? Is this a design-error in the standard C++ library?
It is indeed impossible to find out what a stream's or stream buffer's open mode is, and without that knowledge you don't know how certain operations behave. Just think of the append mode.
Both streams and stream buffers should have a mode() function that returns the current open mode setting.
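Without such a member, user code has to remember the mode itself. The following is a minimal sketch of that workaround; mode_tracking_filebuf is a hypothetical class, not part of the standard library and not the interface proposed in this issue. It shadows the non-virtual open() of std::filebuf, so it only records the mode when open() is called through the derived type.

```cpp
#include <fstream>
#include <ios>

// Hypothetical wrapper that records the open mode of a file buffer.
class mode_tracking_filebuf : public std::filebuf {
public:
    // Shadows std::filebuf::open and remembers the mode on success.
    mode_tracking_filebuf* open(const char* path,
                                std::ios_base::openmode m) {
        if (std::filebuf::open(path, m) == nullptr)
            return nullptr;  // open failed: recorded mode is unchanged
        mode_ = m;
        return this;
    }

    // Returns the mode recorded by the last successful open().
    std::ios_base::openmode mode() const { return mode_; }

private:
    std::ios_base::openmode mode_ = std::ios_base::openmode();
};
```

A caller could then test, for example, (buf.mode() & std::ios_base::app) != 0 before deciding how a write will behave.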
[ post Bellevue: Alisdair requested to re-Open. ]
Proposed resolution:
For stream buffers, add a function to the base class as a non-virtual function qualified as const to 27.5.2 [streambuf]:
openmode mode() const;
Returns the current open mode.
With streams, I'm not sure what to suggest. In principle, the mode could already be returned by ios_base, but the mode is only initialized for file and string stream objects, unless I'm overlooking anything. For this reason it should be added to the most derived stream classes. Alternatively, it could be added to basic_ios and would be default initialized in basic_ios<>::init().
Rationale:
This might be an interesting extension for some future, but it is not a defect in the current standard. The Proposed Resolution is retained for future reference.
Section: 21.3 [basic.string] Status: Ready Submitter: Dave Abrahams Date: 1999-07-01
Discussion:
It is the constness of the container which should control whether it can be modified through a member function such as erase(), not the constness of the iterators. The iterators only serve to give positioning information.
Here's a simple and typical example problem which is currently very difficult or impossible to solve without the change proposed below.
Wrap a standard container C in a class W which allows clients to find and read (but not modify) a subrange of [C.begin(), C.end()). The only modification clients are allowed to make to elements in this subrange is to erase them from C through the use of a member function of W.
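A minimal sketch of such a wrapper follows. It assumes a library in which the container's erase accepts a const_iterator, which is the change this issue asks for; std::vector<int> stands in for the container C, and the class shown is illustrative rather than taken from the issue.

```cpp
#include <utility>
#include <vector>

// Wrapper W: clients can read elements through const_iterators and can
// erase them, but cannot modify them in place.
class W {
public:
    typedef std::vector<int>::const_iterator const_iterator;

    explicit W(std::vector<int> c) : c_(std::move(c)) {}

    const_iterator begin() const { return c_.begin(); }
    const_iterator end() const { return c_.end(); }

    // The iterator only supplies a position; the non-const member function
    // on W is what grants permission to modify the container.
    const_iterator erase(const_iterator pos) { return c_.erase(pos); }

private:
    std::vector<int> c_;
};
```

If erase required a mutable iterator instead, W would have to convert the client's const_iterator back into an iterator, for example by advancing from c_.begin(), which is the difficulty the issue describes.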
[ post Bellevue, Alisdair adds: ]
This issue was implemented by N2350 for everything but basic_string.
Note that the specific example in this issue (basic_string) is the one place we forgot to amend in N2350, so we might open this issue for that single container?
[ Sophia Antipolis: ]
This was a fix that was intended for all standard library containers, and has been done for other containers, but string was missed.
The wording was updated.
We did not make the change in replace, because the string may be written into there, so the change would affect the implementation. This is an issue that should be taken up by concepts.
We note that the supplied wording addresses the initializer list provided in N2679.
Proposed resolution:
Update the following signature in the basic_string class template definition in 21.3 [basic.string], p5:
namespace std {
  template<class charT, class traits = char_traits<charT>,
           class Allocator = allocator<charT> >
  class basic_string {
    ...
    iterator insert(const_iterator p, charT c);
    void insert(const_iterator p, size_type n, charT c);
    template<class InputIterator>
      void insert(const_iterator p, InputIterator first, InputIterator last);
    void insert(const_iterator p, initializer_list<charT>);
    ...
    iterator erase(const_iterator const_position);
    iterator erase(const_iterator first, const_iterator last);
    ...
  };
}
Update the following signatures in 21.3.6.4 [string::insert]:
iterator insert(const_iterator p, charT c);
void insert(const_iterator p, size_type n, charT c);
template<class InputIterator>
  void insert(const_iterator p, InputIterator first, InputIterator last);
void insert(const_iterator p, initializer_list<charT>);
Update the following signatures in 21.3.6.5 [string::erase]:
iterator erase(const_iterator const_position);
iterator erase(const_iterator first, const_iterator last);
Rationale:
The issue was discussed at length. It was generally agreed that 1) There is no major technical argument against the change (although there is a minor argument that some obscure programs may break), and 2) Such a change would not break const correctness. The concerns about making the change were 1) it is user detectable (although only in boundary cases), 2) it changes a large number of signatures, and 3) it seems more of a design issue than an out-and-out defect.
The LWG believes that this issue should be considered as part of a general review of const issues for the next revision of the standard. Also see issue 200.
Section: 25.3.7 [alg.min.max] Status: Open Submitter: Mark Rintoul Date: 1999-08-26
Discussion:
Both std::min and std::max are defined as template functions. This
is very different than the definition of std::plus (and similar
structs) which are defined as function objects which inherit
std::binary_function.
This lack of inheritance leaves std::min and std::max somewhat useless in standard library algorithms which require
a function object that inherits std::binary_function.
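What users typically write today to fill the gap is a small function object of their own. The sketch below is illustrative only: minimum is a hypothetical name, not a proposed standard component, and it supplies the nested typedefs directly rather than inheriting from std::binary_function.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical function object wrapping std::min, in the style of std::plus.
template <class T>
struct minimum {
    typedef T first_argument_type;
    typedef T second_argument_type;
    typedef T result_type;

    T operator()(const T& a, const T& b) const { return std::min(a, b); }
};

// Example use: fold a non-empty sequence down to its smallest element.
inline int smallest(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), v.front(), minimum<int>());
}
```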
[ post Bellevue: Alisdair requested to re-Open. ]
Rationale:
Although perhaps an unfortunate design decision, the omission is not a defect in the current standard. A future standard may wish to consider additional function objects.
Section: 27.5.2 [streambuf] Status: Open Submitter: Martin Sebor Date: 2000-08-12
Discussion:
The basic_streambuf members gbump() and pbump() are specified to take an int argument. This requirement prevents the functions from effectively manipulating buffers larger than std::numeric_limits<int>::max() characters. It also makes the common use case for these functions somewhat difficult as many compilers will issue a warning when an argument of type larger than int (such as ptrdiff_t on LLP64 architectures) is passed to either function. Since it's often the result of the subtraction of two pointers that is passed to the functions, a cast is necessary to silence such warnings. Finally, the usage of a native type in the functions' signatures is inconsistent with other member functions (such as sgetn() and sputn()) that manipulate the underlying character buffer. Those functions take a streamsize argument.
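With the current int signature, a derived stream buffer that needs to advance by a streamsize amount has to split the advance into int-sized steps. A minimal sketch of that workaround, where chunked_bump_buf and its members are hypothetical names used only for illustration and only forward movement of the put pointer is handled:

```cpp
#include <cassert>
#include <ios>
#include <limits>
#include <streambuf>

// Hypothetical stream buffer over a caller-supplied put area.
class chunked_bump_buf : public std::streambuf {
public:
    chunked_bump_buf(char* first, char* last) { setp(first, last); }

    // Advances the put pointer by n characters, calling pbump(int) as many
    // times as needed so that no single argument exceeds INT_MAX.
    void advance_put(std::streamsize n) {
        assert(n >= 0 && n <= epptr() - pptr());
        const std::streamsize step = std::numeric_limits<int>::max();
        while (n > step) {
            pbump(static_cast<int>(step));
            n -= step;
        }
        pbump(static_cast<int>(n));
    }

    // Current distance of the put pointer from the start of the put area.
    std::streamsize put_offset() const { return pptr() - pbase(); }
};
```

The explicit casts are the ones the discussion above says are needed to silence conversion warnings.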
Proposed resolution:
Change the signatures of these functions in the synopsis of template class basic_streambuf (27.5.2) and in their descriptions (27.5.2.3.1, p4 and 27.5.2.3.2, p4) to take a streamsize argument.
Although this change has the potential of changing the ABI of the library, the change will affect only platforms where int is different than the definition of streamsize. However, since both functions are typically inline (they are on all known implementations), even on such platforms the change will not affect any user code unless it explicitly relies on the existing type of the functions (e.g., by taking their address). Such a possibility is IMO quite remote.
Alternate Suggestion from Howard Hinnant, c++std-lib-7780:
This is something of a nit, but I'm wondering if streamoff wouldn't be a better choice than streamsize. The argument to pbump and gbump MUST be signed. But the standard has this to say about streamsize (27.4.1/2/Footnote):
[Footnote: streamsize is used in most places where ISO C would use size_t. Most of the uses of streamsize could use size_t, except for the strstreambuf constructors, which require negative values. It should probably be the signed type corresponding to size_t (which is what Posix.2 calls ssize_t). --- end footnote]
This seems a little weak for the argument to pbump and gbump. Should we ever really get rid of strstream, this footnote might go with it, along with the reason to make streamsize signed.
Rationale:
The LWG believes this change is too big for now. We may wish to reconsider this for a future revision of the standard. One possibility is overloading pbump, rather than changing the signature.
[2006-05-04: Reopened at the request of Chris (Krzysztof Żelechowski)]
Section: 25.1.1 [alg.foreach] Status: Open Submitter: Angelika Langer Date: 2001-01-03
Discussion:
The specification of the for_each algorithm does not have a "Requires" section, which means that there are no restrictions imposed on the function object whatsoever. In essence it means that I can provide any function object with arbitrary side effects and I can still expect a predictable result. In particular I can expect that the function object is applied exactly last - first times, which is promised in the "Complexity" section.
I don't see how any implementation can give such a guarantee without imposing requirements on the function object.
Just as an example: consider a function object that removes elements from the input sequence. In that case, what does the complexity guarantee (applies f exactly last - first times) mean?
One can argue that this is obviously a nonsensical application and a theoretical case, which unfortunately it isn't. I have seen programmers shooting themselves in the foot this way, and they did not understand that there are restrictions even if the description of the algorithm does not say so.
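One way for user code to stay clear of the problem is to keep the function object free of container mutation and to remove elements with the erase-remove idiom instead. A minimal sketch; drop_negatives is a hypothetical helper, and the predicate is chosen only for illustration.

```cpp
#include <algorithm>
#include <vector>

// Removes every negative element without erasing from inside a function
// object that an algorithm is still iterating with.
inline void drop_negatives(std::vector<int>& v) {
    // remove_if only moves elements within [begin, end); the single erase
    // call happens after the algorithm has finished with its iterators.
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](int x) { return x < 0; }),
            v.end());
}
```

By contrast, calling v.erase from inside the function object passed to for_each over the same vector invalidates the iterators the algorithm is still using.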
[Lillehammer: This is more general than for_each. We don't want the function object in transform invalidating iterators either. There should be a note somewhere in clause 17 (17, not 25) saying that user code operating on a range may not invalidate iterators unless otherwise specified. Bill will provide wording.]
Proposed resolution:
Section: 24.1.4 [bidirectional.iterators], 24.1.5 [random.access.iterators] Status: Open Submitter: John Potter Date: 2001-01-22
Discussion:
In section 24.1.4 [bidirectional.iterators], Table 75 gives the return type of *r-- as convertible to T. This is not consistent with Table 74 which gives the return type of *r++ as T&. *r++ = t is valid while *r-- = t is invalid.
In section 24.1.5 [random.access.iterators], Table 76 gives the return type of a[n] as convertible to T. This is not consistent with the semantics of *(a + n) which returns T& by Table 74. *(a + n) = t is valid while a[n] = t is invalid.
Discussion from the Copenhagen meeting: the first part is uncontroversial. The second part, operator[] for Random Access Iterators, requires more thought. There are reasonable arguments on both sides. Return by value from operator[] enables some potentially useful iterators, e.g. a random access "iota iterator" (a.k.a "counting iterator" or "int iterator"). There isn't any obvious way to do this with return-by-reference, since the reference would be to a temporary. On the other hand, reverse_iterator takes an arbitrary Random Access Iterator as template argument, and its operator[] returns by reference. If we decided that the return type in Table 76 was correct, we would have to change reverse_iterator. This change would probably affect user code.
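The "counting iterator" mentioned above can be sketched as follows. This is a deliberately partial illustration with only enough operations to show the point; counting_iterator is a hypothetical class, not a complete Random Access Iterator and not proposed wording.

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical iterator over the integers: it stores only the current
// value, so there is no element object for operator[] to refer to.
class counting_iterator {
public:
    typedef std::random_access_iterator_tag iterator_category;
    typedef int value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const int* pointer;
    typedef int reference;  // by value, i.e. "convertible to T"

    explicit counting_iterator(int start = 0) : value_(start) {}

    int operator*() const { return value_; }

    // Returns by value; returning int& here would refer to a temporary.
    int operator[](difference_type n) const {
        return value_ + static_cast<int>(n);
    }

    counting_iterator& operator+=(difference_type n) {
        value_ += static_cast<int>(n);
        return *this;
    }
    friend counting_iterator operator+(counting_iterator it,
                                       difference_type n) {
        it += n;
        return it;
    }
    friend bool operator==(const counting_iterator& a,
                           const counting_iterator& b) {
        return a.value_ == b.value_;
    }

private:
    int value_;
};
```

For this iterator it[3] and *(it + 3) yield the same value, but it[3] = t does not compile, which is why a T& requirement on a[n] would exclude it.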
History: the contradiction between reverse_iterator and the Random Access Iterator requirements has been present from an early stage. In both the STL proposal adopted by the committee (N0527==94-0140) and the STL technical report (HPL-95-11 (R.1), by Stepanov and Lee), the Random Access Iterator requirements say that operator[]'s return value is "convertible to T". In N0527 reverse_iterator's operator[] returns by value, but in HPL-95-11 (R.1), and in the STL implementation that HP released to the public, reverse_iterator's operator[] returns by reference. In 1995, the standard was amended to reflect the contents of HPL-95-11 (R.1). The original intent for operator[] is unclear.
In the long term it may be desirable to add more fine-grained iterator requirements, so that access method and traversal strategy can be decoupled. (See "Improved Iterator Categories and Requirements", N1297 = 01-0011, by Jeremy Siek.) Any decisions about issue 299 should keep this possibility in mind.
Further discussion: I propose a compromise between John Potter's resolution, which requires T& as the return type of a[n], and the current wording, which requires convertible to T. The compromise is to keep the convertible to T for the return type of the expression a[n], but to also add a[n] = t as a valid expression. This compromise "saves" the common case uses of random access iterators, while at the same time allowing iterators such as counting iterator and caching file iterators to remain random access iterators (iterators where the lifetime of the object returned by operator*() is tied to the lifetime of the iterator).
Note that the compromise resolution necessitates a change to reverse_iterator. It would need to use a proxy to support a[n] = t.
Note also there is one kind of mutable random access iterator that will no longer meet the new requirements. Currently, iterators that return an r-value from operator[] meet the requirements for a mutable random access iterator, even though the expression a[n] = t will only modify a temporary that goes away. With this proposed resolution, a[n] = t will be required to have the same operational semantics as *(a + n) = t.
Proposed resolution:
In section 24.1.4 [lib.bidirectional.iterators], change the return type in table 75 from "convertible to T" to T&.
In section 24.1.5 [lib.random.access.iterators], change the operational semantics for a[n] to "the r-value of a[n] is equivalent to the r-value of *(a + n)". Add a new row in the table for the expression a[n] = t with a return type of convertible to T and operational semantics of *(a + n) = t.
[Lillehammer: Real problem, but should be addressed as part of iterator redesign]
Section: 27.6 [iostream.format] Status: Open Submitter: Martin Sebor Date: 2001-03-19
Discussion:
The descriptions of the constructors of basic_istream<>::sentry (27.6.1.1.3 [istream::sentry]) and basic_ostream<>::sentry (27.6.2.4 [ostream::sentry]) do not explain what the functions do in case an exception is thrown while they execute. Some current implementations allow all exceptions to propagate, others catch them and set ios_base::badbit instead, still others catch some but let others propagate.
The text also mentions that the functions may call setstate(failbit) (without actually saying on what object, but presumably the stream argument is meant). That may have been fine for basic_istream<>::sentry prior to issue 195, since the function performs an input operation which may fail. However, issue 195 amends 27.6.1.1.3 [istream::sentry], p2 to clarify that the function should actually call setstate(failbit | eofbit), so the sentence in p3 is redundant or even somewhat contradictory.
The same sentence that appears in 27.6.2.4 [ostream::sentry], p3 doesn't seem to be very meaningful for basic_ostream<>::sentry, which performs no input. It is actually rather misleading since it would appear to guide library implementers to calling setstate(failbit) when os.tie()->flush(), the only called function, throws an exception (typically, it's badbit that's set in response to such an event).
Additional comments from Martin, who isn't comfortable with the current proposed resolution (see c++std-lib-11530)
The istream::sentry ctor says nothing about how the function deals with exceptions (27.6.1.1.2, p1 says that the class is responsible for doing "exception safe"(*) prefix and suffix operations but it doesn't explain what level of exception safety the class promises to provide). The mockup example of a "typical implementation of the sentry ctor" given in 27.6.1.1.2, p6, removed in ISO/IEC 14882:2003, doesn't show exception handling, either. Since the ctor is not classified as a formatted or unformatted input function, the text in 27.6.1.1, p1 through p4 does not apply. All this would seem to suggest that the sentry ctor should not catch or in any way handle exceptions thrown from any functions it may call. Thus, the typical implementation of an istream extractor may look something like [1].
The problem with [1] is that while it correctly sets ios::badbit if an exception is thrown from one of the functions called from the sentry ctor, if the sentry ctor reaches EOF while extracting whitespace from a stream that has eofbit or failbit set in exceptions(), it will cause an ios::failure to be thrown, which will in turn cause the extractor to set ios::badbit.
The only straightforward way to prevent this behavior is to move the definition of the sentry object in the extractor above the try block (as suggested by the example in 22.2.8, p9 and also indirectly supported by 27.6.1.3, p1). See [2]. But such an implementation will allow exceptions thrown from functions called from the ctor to freely propagate to the caller regardless of the setting of ios::badbit in the stream object's exceptions().
So since neither [1] nor [2] behaves as expected, the only possible solution is to have the sentry ctor catch exceptions thrown from called functions, set badbit, and propagate those exceptions if badbit is also set in exceptions(). (Another solution exists that deals with both kinds of sentries, but the code is non-obvious and cumbersome -- see [3].)
Please note that, as the issue points out, current libraries do not behave consistently, suggesting that implementors are not quite clear on the exception handling in istream::sentry, despite the fact that some LWG members might feel otherwise. (As documented by the parenthetical comment here: http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1480.html#309)
Also please note that those LWG members who in Copenhagen felt that "a sentry's constructor should not catch exceptions, because sentries should only be used within (un)formatted input functions and that exception handling is the responsibility of those functions, not of the sentries," as noted here http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2001/n1310.html#309 would in effect be either arguing for the behavior described in [1] or for extractors implemented along the lines of [3].
The original proposed resolution (Revision 25 of the issues list) clarifies the role of the sentry ctor with respect to exception handling by making it clear that extractors (both library and user-defined) should be implemented along the lines of [2] (as opposed to [1]) and that no exception thrown from the callees should propagate out of either function unless badbit is also set in exceptions().
[1] Extractor that catches exceptions thrown from sentry:
struct S { long i; };
istream& operator>> (istream &strm, S &s)
{
    ios::iostate err = ios::goodbit;
    try {
        const istream::sentry guard (strm, false);
        if (guard) {
            use_facet<num_get<char> >(strm.getloc ())
                .get (istreambuf_iterator<char>(strm),
                      istreambuf_iterator<char>(),
                      strm, err, s.i);
        }
    }
    catch (...) {
        bool rethrow;
        try {
            strm.setstate (ios::badbit);
            rethrow = false;
        }
        catch (...) {
            rethrow = true;
        }
        if (rethrow)
            throw;
    }
    if (err)
        strm.setstate (err);
    return strm;
}
[2] Extractor that propagates exceptions thrown from sentry:
istream& operator>> (istream &strm, S &s)
{
    istream::sentry guard (strm, false);
    if (guard) {
        ios::iostate err = ios::goodbit;
        try {
            use_facet<num_get<char> >(strm.getloc ())
                .get (istreambuf_iterator<char>(strm),
                      istreambuf_iterator<char>(),
                      strm, err, s.i);
        }
        catch (...) {
            bool rethrow;
            try {
                strm.setstate (ios::badbit);
                rethrow = false;
            }
            catch (...) {
                rethrow = true;
            }
            if (rethrow)
                throw;
        }
        if (err)
            strm.setstate (err);
    }
    return strm;
}
[3] Extractor that catches exceptions thrown from sentry but doesn't set badbit if the exception was thrown as a result of a call to strm.clear().
istream& operator>> (istream &strm, S &s)
{
    const ios::iostate state = strm.rdstate ();
    const ios::iostate except = strm.exceptions ();
    ios::iostate err = std::ios::goodbit;
    bool thrown = true;
    try {
        const istream::sentry guard (strm, false);
        thrown = false;
        if (guard) {
            use_facet<num_get<char> >(strm.getloc ())
                .get (istreambuf_iterator<char>(strm),
                      istreambuf_iterator<char>(),
                      strm, err, s.i);
        }
    }
    catch (...) {
        if (thrown && state & except)
            throw;
        try {
            strm.setstate (ios::badbit);
            thrown = false;
        }
        catch (...) {
            thrown = true;
        }
        if (thrown)
            throw;
    }
    if (err)
        strm.setstate (err);
    return strm;
}
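The EOF case that separates [1] from [2] can be observed with a short probe. The following is only a sketch and not part of the issue text: ProbeResult and probe_whitespace_only are hypothetical names, and a std::istringstream holding nothing but whitespace stands in for a stream that reaches EOF inside the sentry ctor. Whether an exception escapes, and whether badbit ends up set, is the implementation-dependent behavior under discussion, so the sketch records both rather than assuming either.

```cpp
#include <cassert>
#include <exception>
#include <ios>
#include <sstream>

// Hypothetical record of what one extraction attempt did to the stream.
struct ProbeResult {
    bool threw;  // an exception escaped operator>>
    bool eof;    // eofbit is set afterwards
    bool fail;   // failbit or badbit is set afterwards
    bool bad;    // badbit is set afterwards
};

// Extracts an int from a whitespace-only stream that has eofbit set in
// exceptions(). The sentry skips the whitespace and reaches EOF.
inline ProbeResult probe_whitespace_only()
{
    std::istringstream in("   ");
    in.exceptions(std::ios::eofbit);

    ProbeResult r = { false, false, false, false };
    int value = 0;
    try {
        in >> value;
    }
    catch (const std::exception&) {
        r.threw = true;  // typically an ios::failure
    }
    r.eof  = in.eof();
    r.fail = in.fail();
    r.bad  = in.bad();

    // Reaching EOF leaves the stream with eofbit set and fail() true.
    assert(r.eof && r.fail);
    return r;
}
```

With an extractor written along the lines of [1], bad would be true after this probe; with one written along the lines of [2], the ios::failure escapes and bad stays false.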
[Pre-Berlin] Reopened at the request of Paolo Carlini and Steve Clamage.
[Pre-Portland] A relevant newsgroup post:
The current proposed resolution of issue #309 (http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#309) is unacceptable. I write commercial software and coding around this makes my code ugly, non-intuitive, and requires comments referring people to this very issue. Following is the full explanation of my experience.
In the course of writing software for commercial use, I constructed std::ifstream's based on user-supplied pathnames on typical POSIX systems.
It was expected that some files that opened successfully might not read successfully -- such as a pathname which actually referred to a directory. Intuitively, I expected the streambuffer underflow() code to throw an exception in this situation, and recent implementations of libstdc++'s basic_filebuf do just that (as well as many of my own custom streambufs).
I also intuitively expected that the istream code would convert these exceptions to the "badbit" set on the stream object, because I had not requested exceptions. I refer to 27.6.1.1, p4.
However, this was not the case on at least two implementations -- if the first thing I did with an istream was call operator>>( T& ) for T among the basic arithmetic types and std::string. Looking further I found that the sentry's constructor was invoking the exception when it pre-scanned for whitespace, and the extractor function (operator>>()) was not catching exceptions in this situation.
So, I was in a situation where setting 'noskipws' would change the istream's behavior even though no characters (whitespace or not) could ever be successfully read.
Also, calling .peek() on the istream before calling the extractor() changed the behavior (.peek() had the effect of setting the badbit ahead of time).
I found this all to be so inconsistent and inconvenient for me and my code design, that I filed a bugzilla entry for libstdc++. I was then told that the bug cannot be fixed until issue #309 is resolved by the committee.
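The behavior described in the post can be probed with a stream buffer whose underflow() always throws. This is a sketch under assumptions, not the poster's code: ThrowingBuf, Observed, peek_then_extract and extract_directly are hypothetical names, and the buffer merely stands in for a file that opens but cannot be read. The peek() path follows the rules for unformatted input functions; the direct path is the one whose outcome differs between implementations, so the sketch only records it.

```cpp
#include <cassert>
#include <istream>
#include <stdexcept>
#include <streambuf>

// Hypothetical buffer standing in for a source that opens but cannot be read.
struct ThrowingBuf : std::streambuf {
protected:
    virtual int_type underflow() { throw std::runtime_error("read failed"); }
};

struct Observed {
    bool threw;  // an exception escaped to the caller
    bool bad;    // badbit is set afterwards
};

// peek() is an unformatted input function: it catches the exception from
// underflow() and sets badbit. The later extraction then finds a stream that
// is not good() and never touches the buffer.
inline Observed peek_then_extract()
{
    ThrowingBuf buf;
    std::istream in(&buf);
    Observed r = { false, false };
    try {
        in.peek();
        int value = 0;
        in >> value;
    }
    catch (...) {
        r.threw = true;
    }
    r.bad = in.bad();
    assert(!r.threw && r.bad);
    return r;
}

// Extraction as the first operation: the sentry ctor skips whitespace and so
// calls underflow(). Whether the exception escapes or is turned into badbit
// is the open question, so no outcome is asserted here.
inline Observed extract_directly()
{
    ThrowingBuf buf;
    std::istream in(&buf);
    Observed r = { false, false };
    try {
        int value = 0;
        in >> value;
    }
    catch (...) {
        r.threw = true;
    }
    r.bad = in.bad();
    return r;
}
```

Comparing the two results on a given library shows whether a preceding peek() changes what the caller sees, which is the inconsistency the post complains about.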
Proposed resolution:
Rationale:
The LWG agrees there is minor variation between implementations, but believes that it doesn't matter. This is a rarely used corner case. There is no evidence that this has any commercial importance or that it causes actual portability problems for customers trying to write code that runs on multiple implementations.
Section: 27.6.1.3 [istream.unformatted] Status: Open Submitter: Howard Hinnant Date: 2001-10-09
Discussion:
I think we have a defect.
According to lwg issue 60 which is now a dr, the description of seekg in 27.6.1.3 [istream.unformatted] paragraph 38 now looks like:
Behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1), except that it does not count the number of characters extracted and does not affect the value returned by subsequent calls to gcount(). After constructing a sentry object, if fail() != true, executes rdbuf()->pubseekpos( pos).
And according to lwg issue 243 which is also now a dr, 27.6.1.3, paragraph 1 looks like:
Each unformatted input function begins execution by constructing an object of class sentry with the default argument noskipws (second) argument true. If the sentry object returns true, when converted to a value of type bool, the function endeavors to obtain the requested input. Otherwise, if the sentry constructor exits by throwing an exception or if the sentry object returns false, when converted to a value of type bool, the function returns without attempting to obtain any input. In either case the number of extracted characters is set to 0; unformatted input functions taking a character array of non-zero size as an argument shall also store a null character (using charT()) in the first location of the array. If an exception is thrown during input then ios::badbit is turned on in *this's error state. If (exception()&badbit)!= 0 then the exception is rethrown. It also counts the number of characters extracted. If no exception has been thrown it ends by storing the count in a member object and returning the value specified. In any event the sentry object is destroyed before leaving the unformatted input function.
And finally 27.6.1.1.2/5 says this about sentry:
If, after any preparation is completed, is.good() is true, ok_ != false otherwise, ok_ == false.
So although the seekg paragraph says that the operation proceeds if !fail(), the behavior of unformatted functions says the operation proceeds only if good(). The two statements are contradictory when only eofbit is set. I don't think the current text is clear which condition should be respected.
Further discussion from Redmond:
PJP: It doesn't seem quite right to say that seekg is "unformatted". That makes specific claims about sentry that aren't quite appropriate for seeking, which has less fragile failure modes than actual input. If we do really mean that it's unformatted input, it should behave the same way as other unformatted input. On the other hand, "principle of least surprise" is that seeking from EOF ought to be OK.
Pre-Berlin: Paolo points out several problems with the proposed resolution in Ready state:
Proposed resolution:
Change 27.6.1.3 [istream.unformatted] to:
Behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1), except that it does not count the number of characters extracted, does not affect the value returned by subsequent calls to gcount(), and does not examine the value returned by the sentry object. After constructing a sentry object, if fail() != true, executes rdbuf()->pubseekpos(pos). In case of success, the function calls clear(). In case of failure, the function calls setstate(failbit) (which may throw ios_base::failure).
[Lillehammer: Matt provided wording.]
Rationale:
In C, fseek does clear EOF. This is probably what most users would expect. We agree that having eofbit set should not deter a seek, and that a successful seek should clear eofbit. Note that fail() is true only if failbit or badbit is set, so using !fail(), rather than good(), satisfies this goal.
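The difference between the two conditions can be checked directly. A minimal sketch, assuming a std::istringstream in which eofbit is set by hand; StateFlags and flags_with_only_eofbit are hypothetical names. It deliberately stops short of calling seekg(), since the outcome of that call is what the issue is about.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical snapshot of the three state predicates.
struct StateFlags {
    bool good;
    bool fail;
    bool eof;
};

inline StateFlags flags_with_only_eofbit()
{
    std::istringstream in("abc");
    in.setstate(std::ios::eofbit);  // exceptions() is goodbit, so nothing is thrown

    StateFlags f = { in.good(), in.fail(), in.eof() };

    // good() requires every state bit to be clear, whereas fail() looks at
    // failbit and badbit only.
    assert(!f.good && !f.fail && f.eof);
    return f;
}
```

With only eofbit set, good() is false while fail() is also false, which is the state in which the two quoted passages disagree.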
Section: 21 [strings], 23 [containers], 27 [input.output] Status: Open Submitter: Martin Sebor Date: 2001-10-09
Discussion:
The synopses of the C++ library headers clearly show which names are required to be defined in each header. Implementing the classes and templates defined in these headers typically requires declarations of other templates (but not necessarily their definitions), so the standard in 17.4.4, p1 permits library implementers to include any headers needed to implement the definitions in each header.
For instance, although it is not explicitly specified in the synopsis of <string>, at the point of definition of the std::basic_string template the declaration of the std::allocator template must be in scope. All current implementations simply include <memory> from within <string>, either directly or indirectly, to bring the declaration of std::allocator into scope.
Additionally, however, some implementations also include <istream> and <ostream> at the top of <string> to bring the declarations of std::basic_istream and std::basic_ostream into scope (which are needed in order to implement the string inserter and extractor operators (21.3.7.9 [lib.string.io])). Other implementations only include <iosfwd>, since strictly speaking, only the declarations and not the full definitions are necessary.
Obviously, it is possible to implement <string> without actually providing the full definitions of all the templates std::basic_string uses (std::allocator, std::basic_istream, and std::basic_ostream). Furthermore, not only is it possible, doing so is likely to have a positive effect on compile-time efficiency.
But while it may seem perfectly reasonable to expect a program that uses the std::basic_string insertion and extraction operators to also explicitly include <istream> or <ostream>, respectively, it doesn't seem reasonable to also expect it to explicitly include <memory>. Since what's reasonable and what isn't is highly subjective, one would expect the standard to specify what can and what cannot be assumed. Unfortunately, that isn't the case.
The examples below demonstrate the issue.
Example 1:
It is not clear whether the following program is complete:
#include <string>
extern std::basic_ostream<char> &strm;
int main () {
    strm << std::string ("Hello, World!\n");
}
or whether one must explicitly include <memory> or <ostream> (or both) in addition to <string> in order for the program to compile.
Example 2:
Similarly, it is unclear whether the following program is complete:
#include <istream>
extern std::basic_iostream<char> &strm;
int main () {
    strm << "Hello, World!\n";
}
or whether one needs to explicitly include <ostream>, and perhaps even other headers containing the definitions of other required templates:
#include <ios>
#include <istream>
#include <ostream>
#include <streambuf>
extern std::basic_iostream<char> &strm;
int main () {
    strm << "Hello, World!\n";
}
Example 3:
Likewise, it seems unclear whether the program below is complete:
#include <iterator>
bool foo (std::istream_iterator<int> a, std::istream_iterator<int> b)
{
    return a == b;
}
int main () { }
or whether one should be required to include <istream>.
There are many more examples that demonstrate this lack of a requirement. I believe that in a good number of cases it would be unreasonable to require that a program explicitly include all the headers necessary for a particular template to be specialized, but I think that there are cases such as some of those above where it would be desirable to allow implementations to include only as much as necessary and not more.
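Until the standard specifies such dependencies, a program can sidestep the question by naming every header whose declarations it uses. A minimal sketch of Example 1 written that way; greet is a hypothetical function, and a std::ostringstream replaces the external stream so that the fragment is self-contained:

```cpp
#include <cassert>
#include <memory>   // std::allocator, used by std::basic_string
#include <ostream>  // std::basic_ostream and its inserters
#include <sstream>  // std::ostringstream, needed only for this sketch
#include <string>   // std::string and its inserter

inline std::string greet ()
{
    std::ostringstream out;
    std::basic_ostream<char> &strm = out;
    strm << std::string ("Hello, World!\n");
    assert (!strm.fail ());  // the insertion is expected to succeed
    return out.str ();
}
```

Written this way the program does not depend on which headers <string> happens to include on a particular implementation, at the cost of the extra compile time the discussion mentions.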
[ post Bellevue: ]
Position taken in prior reviews is that the idea of a table of header dependencies is a good one. Our view is that a full paper is needed to do justice to this, and we've made that recommendation to the issue author.
Proposed resolution:
For every C++ library header, supply a minimum set of other C++ library headers that are required to be included by that header. The proposed list is below (C++ headers for C Library Facilities, table 12 in 17.4.1.2, p3, are omitted):
+------------+--------------------+
| C++ header |required to include |
+============+====================+
|<algorithm> |                    |
+------------+--------------------+
|<bitset>    |                    |
+------------+--------------------+
|<complex>   |                    |
+------------+--------------------+
|<deque>     |<memory>            |
+------------+--------------------+
|<exception> |                    |
+------------+--------------------+
|<fstream>   |<ios>               |
+------------+--------------------+
|<functional>|                    |
+------------+--------------------+
|<iomanip>   |<ios>               |
+------------+--------------------+
|<ios>       |<streambuf>         |
+------------+--------------------+
|<iosfwd>    |                    |
+------------+--------------------+
|<iostream>  |<istream>, <ostream>|
+------------+--------------------+
|<istream>   |<ios>               |
+------------+--------------------+
|<iterator>  |                    |
+------------+--------------------+
|<limits>    |                    |
+------------+--------------------+
|<list>      |<memory>            |
+------------+--------------------+
|<locale>    |                    |
+------------+--------------------+
|<map>       |<memory>            |
+------------+--------------------+
|<memory>    |                    |
+------------+--------------------+
|<new>       |<exception>         |
+------------+--------------------+
|<numeric>   |                    |
+------------+--------------------+
|<ostream>   |<ios>               |
+------------+--------------------+
|<queue>     |<deque>             |
+------------+--------------------+
|<set>       |<memory>            |
+------------+--------------------+
|<sstream>   |<ios>, <string>     |
+------------+--------------------+
|<stack>     |<deque>             |
+------------+--------------------+
|<stdexcept> |                    |
+------------+--------------------+
|<streambuf> |<ios>               |
+------------+--------------------+
|<string>    |<memory>            |
+------------+--------------------+
|<strstream> |                    |
+------------+--------------------+
|<typeinfo>  |<exception>         |
+------------+--------------------+
|<utility>   |                    |
+------------+--------------------+
|<valarray>  |                    |
+------------+--------------------+
|<vector>    |<memory>            |
+------------+--------------------+
Rationale:
The portability problem is real. A program that works correctly on one implementation might fail on another, because of different header dependencies. This problem was understood before the standard was completed, and it was a conscious design choice.
One possible way to deal with this, as a library extension, would be an <all> header.
Hinnant: It's time we dealt with this issue for C++0X. Reopened.
Section: 22.2.1.4 [locale.codecvt] Status: Open Submitter: Martin Sebor Date: 2002-08-30
Discussion:
It seems that the descriptions of codecvt do_in() and do_out() leave sufficient room for interpretation so that two implementations of codecvt may not work correctly with the same filebuf. Specifically, the following seems less than adequately specified:
Finally, the conditions described at the end of 22.2.1.4.2 [locale.codecvt.virtuals], p4 don't seem to be possible:
"A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed bef